## Electrical Engineering Writing Service | Parallel Computing Writing and Exam Service | CSE179

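statistics-lab™ supports you throughout your studies abroad. It has built a reputation for Parallel Computing writing services and promises a reliable, high-quality and original Statistics writing service. Our experts are highly experienced in Parallel Computing, and that naturally extends to all kinds of Parallel Computing assignments.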

• Statistical Inference
• Statistical Computing
• (Generalized) Linear Models
• Statistical Machine Learning
• Longitudinal Data Analysis
• Foundations of Data Science

## The Three Layers of Parallelism

There used to be a time when the growing transistor count materialised primarily in new compute features and, at the same time, was paired with ever higher frequencies. Nowadays, we still see the transistor count increase, but the frequency (the speed of the individual transistors) stagnates or is even reduced.

With Moore’s law continuing to hold and Dennard scaling breaking down (the frequency can no longer be increased at a given power envelope per transistor), there is a “new” kid on the block that helps us to build more powerful computers and eventually allows Computational $\mathrm{X}$ to run more challenging simulations. Actually, there are three new kids around that dominate code development today (Fig. 2.2). However, they are all flavours of one common pattern:

1. The parallelism in the computer increases, as modern computers still do one addition or multiplication or … on one or two pieces of data in one (abstract) step, but they apply the same operation to a whole vector of entries in one rush. We call this vector parallelism.
2. The parallelism in the computer increases, as modern CPUs do not only host one core but an ensemble of cores per chip. Since this multicore idea implies that all cores share their memory, we call this shared memory parallelism.
3. The parallelism in the computer increases, as modern supercomputers consist of thousands of compute nodes. A compute node is our term for a classic computer which “speaks” to other computers via a network. Since the individual nodes are independent computers, they do not share their memory. Each one has memory of its own. We therefore call this distributed memory parallelism.

The three levels of parallelism have different potential to speed up calculations. This potential depends on the character of the underlying calculations as well as on the hardware, while the boundaries in-between the parallelism flavours are often blurred.
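To make the second flavour a little more tangible, here is a minimal sketch of the work decomposition behind shared memory parallelism: a list is cut into chunks, and each worker thread sums one chunk of the same list in memory. The function names (`chunk_bounds`, `partial_sum`, `parallel_sum`) are made up for this illustration. Note that in standard CPython the global interpreter lock prevents a real speedup for this kind of loop, so the sketch only shows how the work is split, not how fast it runs.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split the index range [0, n) into at most `workers` contiguous chunks."""
    step = (n + workers - 1) // workers  # ceiling division
    return [(lo, min(lo + step, n)) for lo in range(0, n, step)]


def partial_sum(data, lo, hi):
    """Sum data[lo:hi]; every worker reads the same shared list."""
    total = 0.0
    for i in range(lo, hi):
        total += data[i]
    return total


def parallel_sum(data, workers=4):
    """Sum `data` by handing each worker thread one chunk of the shared list."""
    if not data:
        return 0.0
    bounds = chunk_bounds(len(data), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(partial_sum, data, lo, hi) for lo, hi in bounds]
        # Combine the partial results once all workers have finished.
        return sum(f.result() for f in futures)


if __name__ == "__main__":
    values = [float(i) for i in range(1000)]
    print(parallel_sum(values, workers=4))
```

On a distributed memory machine, each chunk would instead live in the memory of a different node, and the partial sums would have to travel over the network before they can be combined.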

## The N-Body Problem

Given are $N$ bodies (planets, e.g.) out there in space. They interact with each other through gravity. Let body 1 be described by its position $p_{1}=\left(p_{x, 1}, p_{y, 1}, p_{z, 1}\right)^{T} \in \mathbb{R}^{3}$ (a vector) and its mass $m_{1}$. Furthermore, it has a certain velocity $v_{1}=\left(v_{x, 1}, v_{y, 1}, v_{z, 1}\right)^{T} \in \mathbb{R}^{3}$, which is a vector again. Body 2 is defined analogously. With the gravitational constant $G$, body 1 experiences a force
$$F_{1}=G \frac{m_{1} \cdot m_{2}}{\left\|p_{1}-p_{2}\right\|_{2}^{2}} \cdot \frac{\left(p_{2}-p_{1}\right)}{\left\|p_{1}-p_{2}\right\|_{2}} .$$
If there are more than two objects, then body 1 also gets a contribution from body 3, 4, and so forth. The forces simply sum up. We furthermore know that
$$\partial_{t} v_{1}(t)=\frac{F_{1}}{m_{1}} \tag{3.2}$$
and
$$\partial_{t} p_{1}(t)=v_{1}(t) . \tag{3.3}$$
This is a complete mathematical model of reality. Equations (3.2) and (3.3) highlight that the velocity and position of our object depend on time. These equations are our whole theory of how the world out there in space behaves (cf. Chap. 1) in a Newtonian sense. Einstein later revised this model.
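The model above can be turned into a few lines of code. The following is a minimal sketch, not the author's implementation: `force_on` sums the pairwise gravitational forces on one body, and `euler_step` advances velocities and positions with an explicit Euler scheme. The time-stepping scheme, the function names and the step size `dt` are assumptions made for this illustration; the text so far only states the equations.

```python
import math

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2


def force_on(i, positions, masses):
    """Sum the gravitational forces that all other bodies exert on body i."""
    force = [0.0, 0.0, 0.0]
    for j in range(len(positions)):
        if j == i:
            continue
        diff = [positions[j][d] - positions[i][d] for d in range(3)]  # p_j - p_i
        dist = math.sqrt(sum(c * c for c in diff))
        # 1/dist^2 from the force law times 1/dist to normalise the direction.
        scale = G * masses[i] * masses[j] / dist**3
        for d in range(3):
            force[d] += scale * diff[d]
    return force


def euler_step(positions, velocities, masses, dt):
    """Advance all bodies by one explicit Euler step of size dt (assumed scheme)."""
    forces = [force_on(i, positions, masses) for i in range(len(positions))]
    # d/dt p = v: move every body along its current velocity.
    new_positions = [
        [p[d] + dt * v[d] for d in range(3)] for p, v in zip(positions, velocities)
    ]
    # d/dt v = F / m: accelerate every body by the force acting on it.
    new_velocities = [
        [v[d] + dt * f[d] / m for d in range(3)]
        for v, f, m in zip(velocities, forces, masses)
    ]
    return new_positions, new_velocities
```

Every body needs the positions of all other bodies, so one step costs on the order of $N^{2}$ force evaluations. The sketch does not handle two bodies at the same position, where the distance becomes zero.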

The expressions $\partial_{t} y(t)=\frac{\partial}{\partial t} y(t)$ both denote the derivative of a function $y(t)$. Often, we drop the $(t)$ parameter; we have already done so for $F$ above. For the second derivative, there are various notations that all mean the same: $\partial_{t} \partial_{t} y=\partial_{t t} y=\frac{\partial \partial}{\partial t \partial t} y$. I often use $\partial_{t}^{(2)}$. This notation makes it easy to specify arbitrarily high derivatives.

Besides $\partial_{t}$, I also use $\mathrm{d}_{t}$. This is in line with a lot of literature in mathematics and physics. There is no difference between the two of them as long as we deal with a plain function $f(t)$ only. However, if we have an $f(t, x(t))$ with two arguments where both arguments depend on $t$ (one of them is $t$ itself, the other one accepts $t$ as argument), then $\partial_{t}$ is the derivative where we alter the direct $t$ argument only. $\partial_{x}$ or $\partial_{x(t)}$ is the derivative w.r.t. the second variable. They both are partial derivatives. They look “in one direction”.
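A small worked example may help to separate the notations. It assumes that $\mathrm{d}_{t}$ denotes the total derivative, as is common in the literature, so that the chain rule gives
$$\mathrm{d}_{t} f(t, x(t))=\partial_{t} f(t, x(t))+\partial_{x} f(t, x(t)) \cdot \mathrm{d}_{t} x(t) .$$
Take the (hypothetical) choice $f(t, x)=t \cdot x$ with $x(t)=t^{2}$. Then $\partial_{t} f=x=t^{2}$ and $\partial_{x} f=t$, while $\mathrm{d}_{t} x=2 t$, so
$$\mathrm{d}_{t} f(t, x(t))=t^{2}+t \cdot 2 t=3 t^{2} .$$
This agrees with differentiating $f(t, x(t))=t^{3}$ directly, whereas $\partial_{t} f=t^{2}$ alone only captures the change through the direct $t$ argument.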

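## Generalized Linear Models Exam Help

As a professional service provider for international students, statistics-lab has for many years offered academic services to students in popular study destinations such as the United States, the United Kingdom, Canada and Australia. These include, but are not limited to, essay, assignment, dissertation, report, group project, proposal, paper and presentation writing, programming assignments, thesis revision and polishing, online course completion and exam-taking. The service covers every stage of overseas study, from high school through undergraduate to postgraduate level, and extends to finance, economics, accounting, auditing, management and 99% of subjects worldwide. The writing team includes native English-speaking writers as well as master’s and doctoral students from well-known overseas universities; every writer has strong language skills, a relevant academic background and academic writing experience. We promise 100% original, 100% professional, 100% on time and 100% satisfaction.

## MATLAB Writing Service

MATLAB is a high-performance language for technical computing. It integrates computation, visualisation and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include:

• Math and computation
• Algorithm development
• Modelling, simulation and prototyping
• Data analysis, exploration and visualisation
• Scientific and engineering graphics
• Application development, including graphical user interface building

MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This allows you to solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar non-interactive language such as C or Fortran.

The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy access to matrix software developed by the LINPACK and EISPACK projects, which together represent the state of the art in software for matrix computation. MATLAB has evolved over a period of years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering and science. In industry, MATLAB is the tool of choice for high-productivity research, development and analysis.

MATLAB features a family of application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and apply specialised technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation and others.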

## Electrical Engineering Writing Service | Parallel Computing Writing and Exam Service | CSC267

## Moore’s Law

For decades, computational scientists have been in a comfortable situation: they wrote code and made this code fast on a particular architecture. Everybody knew that architectures evolve more or less continuously, i.e. with every new generation of machines, “old”ish codes ran faster, too. They might not benefit from the latest hardware features, but there was a performance improvement. Some people claim this was Moore’s law, which is not correct. Let’s revisit this “law”:

Gordon Moore, one of the co-founders of Intel, observed that the cost per transistor decreases if we squeeze more transistors onto a chip. From a certain point on, however, the manufacturing cost rises again, since the integration of all the transistors becomes expensive. Consequently, there’s a sweet spot: a magic number of transistors per chip where the chip is most profitable. Moore observed that the “complexity for minimum component costs has increased at a rate of roughly a factor of two per year”. So the number of transistors on a chip around the sweet spot grows exponentially according to this law. The manufacturing sweet spot moves, and vendor designs therefore move with it.

Intel’s executive David House later corrected the statement to a doubling every 18 months, which is less aggressive than Moore’s original factor of two per year but still exponential, while Carver Mead from Caltech coined the term “Moore’s Law”. Today, the law continues to hold, though the rate of the increase has slowed down (Fig. 2.1).

For simulation codes as we have sketched them before, it is not directly clear why the transistor count makes a difference. We are interested in speed. However, there is a correlation: First, vendors use the opportunity to have more transistors to allow the computer to do more powerful things. A computer architecture provides some services (certain types of calculations). With more transistors, we can offer more of these calculation types, i.e. broaden the service set. Furthermore, vendors use the opportunity to squeeze more cores onto the chip. Finally, the growing number of transistors historically went hand in hand with a shrinkage of the transistors.
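The doubling periods quoted above translate into very different growth factors. The following small sketch does the arithmetic for a doubling every year and every 18 months; the helper name `growth_factor` and the ten-year horizon are chosen for illustration only.

```python
def growth_factor(years, doubling_period_years):
    """Factor by which a quantity grows if it doubles every `doubling_period_years`."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    for period in (1.0, 1.5):
        factor = growth_factor(10, period)
        print(f"doubling every {period} years: x{factor:.0f} after 10 years")
```

Over ten years, a yearly doubling yields a factor of about 1000, while a doubling every 18 months yields a factor of about 100. Both are exponential, but the period makes an order of magnitude of difference within a single decade.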

## Dennard Scaling

Definition 2.1 (Dennard scaling) The power cost $P$ to drive a transistor follows roughly
$$P=\alpha \cdot C F V^{2} .$$
This law is called Dennard scaling.

$C$ is the capacitance, which encodes the size of the individual transistors. I use $C$ here for historic reasons. In the remainder of this manuscript, $C$ is some generic constant without a particular meaning. $F$ is the frequency and $V$ the voltage. $\alpha$ is some fixed constant, so we can safely skip it in our follow-up discussions. Note that the original Dennard scaling ignores leakage, which did not play a major role when the law was formulated in 1974.

Dennard’s scaling law is all about power. For both chip designers and computing centres buying and running chips, controlling the power envelope of a chip is a sine qua non, as

• buying power is expensive, and as
• a chip “converts” power into heat. To get the heat out of the system again requires even more power to drive fans, pumps and cooling liquids. But if we don’t get it out of the system on time, the chip will eventually melt down.

While we want to bring the power needs down, we still want a computer to be as capable as possible. That is, it should be able to do as many calculations per second as possible. The Dennard scaling tells us that we have only three degrees of freedom:

1. Reduce the voltage. This is clearly the gold solution, as the $V$ term enters the equation squared. Reducing the voltage, however, is not trivial: if we reduce it too much, the transistors no longer switch reliably. As long as we need a reliable chip, i.e. a chip that always gives us the right answer, modern architectures already operate close to the minimum voltage limit.
2. Reduce the transistor size. Chip vendors try to decrease transistor sizes with the launch of most new chip factories or assembly lines. Unfortunately, this option is now more or less maxed out: you can’t go below a few atoms. A further shrinkage of transistors means that the reliability of the machine starts to suffer, so we might ultimately have to add additional transistors to handle the errors, and these once more need energy. Most importantly, smaller chips are more expensive to build (if they have to meet high quality constraints), which makes further shrinking less attractive.
3. Reduce the frequency. If we reduce the frequency, we usually also get away with a slightly lower voltage, so this amplifies the savings effect further. However, we want to have a faster transistor, not a slower one!
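A quick calculation shows how strongly the third option pays off. The sketch below evaluates $P \propto C F V^{2}$ relative to a baseline chip, so the constant $\alpha$ cancels. The concrete numbers (20% lower frequency, 10% lower voltage) and the function name `relative_power` are hypothetical and only serve the illustration.

```python
def relative_power(capacitance=1.0, frequency=1.0, voltage=1.0):
    """Power relative to a baseline chip, following P ~ C * F * V^2.

    All arguments are ratios against the baseline, so the constant alpha cancels.
    """
    return capacitance * frequency * voltage**2


if __name__ == "__main__":
    # Hypothetical numbers: 20% lower frequency, which lets us drop the voltage by 10%.
    slow_core = relative_power(frequency=0.8, voltage=0.9)
    print(f"one slower core: {slow_core:.3f} of the baseline power")
    # Two such cores offer 2 * 0.8 = 1.6 times the baseline operations per second,
    # provided the work can be split perfectly between them.
    print(f"two slower cores: {2 * slow_core:.3f} of the baseline power")
```

Under these assumptions, a single slower core needs roughly 65% of the baseline power. Two of them together draw about 1.3 times the baseline power but could, in the ideal case, deliver 1.6 times the throughput, which is one way to read why vendors trade frequency for parallelism.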

## Electrical Engineering Writing Service | Parallel Computing Writing and Exam Service | CS525

## A Third Pillar

The two pillars as we have introduced them so far, theory and experiment, are barely enough to explain how we advance in science and engineering today.

• Some experiments are economically infeasible. If we want to design the next generation of jumbo jets, we might want to put each model into a wind tunnel to see whether it does take off. But fuelling a wind tunnel (in particular on a reasonable scale) is extremely expensive.
• Some experiments are ethically inappropriate. If we design novel ways of radiation treatment, it would not be ethical to try this out with patients in a trial-and-error fashion. Another example: If we construct a new bridge, we don’t want the first cars driving over this bridge to be guinea pigs.
• Some experiments are ecologically dubious. If we design novel nuclear reactors, we don’t want to rely on trial-and-error when it comes to safety.
• Some experiments are by construction impossible. If we make up new theories about the Big Bang, we are lost with our two-pillar model: we will likely never be able to run a small, experimental Big Bang.
• Some equations are so complex that we cannot solve them (analytically). For many setups, we have good mathematical models. Maybe, we can even make claims about the existence and properties of solutions to these models. But that does not always mean that the maths gives us a constructive answer, i.e. can tell us what the solution to our model is.

This list is certainly not comprehensive. Its last point is particularly intriguing in modern science: We have some complex equations comprising a term $u(x)$. Let this $u(x)$ be the quantity that we are interested in. Yet, we cannot transform this formula into something written as $u(x)=\ldots$ with no $u(x)$ on the right-hand side. That is, even when we have $x$, we still do not know $u(x)$ directly.
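A toy example illustrates the situation. Take the implicit equation $u-0.5 \sin (u)=x$, which is a hypothetical stand-in chosen for this sketch and not taken from the text. It does not rearrange into a closed formula $u(x)=\ldots$, yet a computer can approximate $u(x)$ for any given $x$. The sketch below uses bisection, which is just one of many possible root finders.

```python
import math


def residual(u, x):
    """Left-hand side minus right-hand side of u - 0.5*sin(u) = x."""
    return u - 0.5 * math.sin(u) - x


def solve_u(x, tolerance=1e-12):
    """Approximate u(x) by bisection on the interval [x - 0.5, x + 0.5].

    The bracket is valid because |0.5*sin(u)| <= 0.5, so the residual is
    non-positive at the left end and non-negative at the right end.
    """
    lo, hi = x - 0.5, x + 0.5
    for _ in range(100):  # hard cap so the loop always terminates
        if hi - lo <= tolerance:
            break
        mid = 0.5 * (lo + hi)
        if residual(mid, x) > 0.0:
            hi = mid  # root lies in the left half
        else:
            lo = mid  # root lies in the right half
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    u = solve_u(1.0)
    print(u, residual(u, 1.0))
```

The residual is strictly increasing in $u$ (its derivative is at least $0.5$), so the bracket contains exactly one solution and the bisection converges to it.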

## Computational X

We call the disciplines $\mathrm{X}$ that rely on computer simulations Computational $\mathrm{X}$. There’s Computational Engineering, Computational Physics, Computational Chemistry, Computational Biology, and so forth. We often use Computational Science and Engineering (CSE or CS&E) as an umbrella term covering all the different flavours. Today, this is a slightly old-fashioned phrase, since we don’t want to exclude Computational Medicine, Computational Finance and so forth. They all share similar challenges. Lacking a better term, let’s stick to CSE.

A typical project in Computational $\mathrm{X}$ brings together expertise and skills from three traditional areas: the application discipline, Mathematics and Computer Science (Fig. 1.3). The term CSE thus covers a broad church of challenges:

1. the modelling of physics or application knowledge with mathematical equations;
2. the transcription of these equations into something (other equations) that a computer can solve. Often that means breaking down an infinitely fine (continuous) model into a finite number of equations;
3. the analysis of these equations: do they have a solution, how reliable is an (approximate) solution, and so forth;
4. the design of algorithms solving these equations;
5. the analysis of these algorithms: how expensive will it be to run them, i.e. what is, for example, their algorithmic complexity;
6. the coding of these algorithms;
7. the systematic testing of these codes;
8. the performance optimisation to make the codes run fast and scale on big machines;
9. the input and output data management;
10. the postprocessing, ranging from visualisation to pattern searches within outputs.

This sequence and variants thereof are called the simulation pipeline. I give a first, simple example of the most prominent steps within this pipeline in Chap. 3. I personally dislike the term pipeline. It suggests, similar to the term waterfall in software development, some kind of sequentiality. In practice, we jump around and even make excursions into the theory and experiment world.
