## Computer Science Assignment Help|Python Exam Help|PYTHON101

Python is a high-level, interpreted, general-purpose programming language. Its design philosophy emphasizes code readability, with significant use of indentation.

Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including structured (particularly procedural), object-oriented, and functional programming. It is often described as a "batteries included" language due to its comprehensive standard library.

statistics-lab™ supports your study-abroad career. It has built a reputation for reliable, high-quality, and original Statistics writing services in Python, and our experts have extensive experience with Python assignments of every kind.

• Statistical Inference
• Statistical Computing
• (Generalized) Linear Models
• Statistical Machine Learning
• Longitudinal Data Analysis
• Foundations of Data Science

## Computer Science Assignment Help|Python Exam Help|HTTP objects filter

As we can see, filters give us excellent traceability of communications and also serve as an ideal complement when analyzing a multitude of attacks. An example of this is the http.content_type filter, thanks to which we can extract the different data flows that take place in an HTTP connection (text/html, application/zip, audio/mpeg, image/gif). This will be very useful for locating malware, exploits, or other types of attacks that are embedded in such a protocol:
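For instance, display-filter expressions such as the following (the specific values are illustrative) isolate particular content types in an HTTP capture:

```
http.content_type == "text/html"
http.content_type == "application/zip"
http.content_type contains "image"
```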

Wireshark provides two types of filters, namely capture filters and display filters:

• Capture filters determine which packets are recorded: only traffic that matches the filter is captured in the first place
• Display filters are applied to packets that have already been captured and control which of them are shown on the main Wireshark screen

When you apply a display filter on the Wireshark main screen, only the traffic that matches the filter will appear. We can also use display filters on the contents of a capture loaded from a pcap file:

We can use the pyshark library to analyze network traffic in Python, since everything Wireshark decodes in each packet is made available as a variable. We can find the source code of the tool in its GitHub repository: https://github.com/KiminewT/pyshark.

In the PyPI repository, we can find the latest version of the library at https://pypi.org/project/pyshark, and we can install it with the pip install pyshark command.

## Computer Science Assignment Help|Python Exam Help|FileCapture and LiveCapture in pyshark

As we saw previously, you can use the FileCapture class to open a previously saved trace file. You can also use pyshark to sniff from an interface in real time with the LiveCapture class, like so:
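A minimal sketch of both approaches might look as follows (the trace-file path and interface name are placeholders, and pyshark requires Wireshark's tshark binary to be installed):

```python
import pyshark

# Open a previously saved trace file
file_cap = pyshark.FileCapture('capture.pcap')

# Or sniff from a network interface in real time
live_cap = pyshark.LiveCapture(interface='eth0')
live_cap.sniff(timeout=10)  # capture packets for ten seconds

# Iterate over the decoded packets
for packet in file_cap:
    print(packet.highest_layer)
```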

Once a capture object is created, either from FileCapture or LiveCapture, several methods and attributes are available at both the capture and packet level. The power of pyshark is that it has access to all of the packet decoders that are built into TShark.
Now, let's see which methods the returned capture object provides.
To check this, we can use the dir function with the capture object:

Both classes offer similar parameters that affect the packets that are returned in the capture object. For example, we can iterate through the packets and apply a function to each one. The most useful method here is apply_on_packets(), which is the main way to iterate through the packets, passing in a function to apply to each packet:
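A hedged sketch of this pattern (the path is a placeholder, and tshark must be installed):

```python
import pyshark

def packet_summary(packet):
    # callback applied to every packet in the capture
    print(packet.sniff_time, packet.highest_layer)

cap = pyshark.FileCapture('capture.pcap')  # placeholder path
cap.apply_on_packets(packet_summary)

# dir() lists the methods and attributes the capture object exposes
print([name for name in dir(cap) if not name.startswith('_')])
```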

This option makes capture file reading much faster, and with the dir function, we can check the attributes that are available in the object to obtain information about a specific packet.

In this chapter, we have completed an introduction to TCP/IP and how machines communicate in a network. We learned about the main protocols of the network stack and the different types of address used for communicating in a network. We started with Python libraries for network programming and looked at the socket module and the urllib and requests modules, and provided an example of how we can interact with and obtain information from RFC documents. We also acquired some basic knowledge so that we are able to perform network traffic analysis with Wireshark.


## Finite Element Method Assignment Help

statistics-lab, as a professional service for international students, has for many years provided academic services to students in popular study destinations such as the United States, the United Kingdom, Canada, and Australia, including but not limited to essay, assignment, dissertation, report, group-project, proposal, paper, and presentation writing, computer science homework, paper editing and polishing, online course taking, and exam taking. Its coverage spans every stage of overseas study, from high school through undergraduate and graduate programs, and 99% of subjects worldwide, including finance, economics, accounting, auditing, and management. The writing team includes both native English writers and graduate students from top overseas universities, each with strong language skills, a solid academic background, and academic writing experience. We promise 100% originality, 100% professionalism, 100% punctuality, and 100% satisfaction.

## MATLAB Assignment Help

MATLAB is a high-performance language for technical computing. It integrates computation, visualization, and programming in an easy-to-use environment where problems and solutions are expressed in familiar mathematical notation. Typical uses include mathematical and computational algorithm development; modeling, simulation, and prototyping; data analysis, exploration, and visualization; scientific and engineering graphics; and application development, including graphical user interface building.

MATLAB is an interactive system whose basic data element is an array that does not require dimensioning. This lets you solve many technical computing problems, especially those with matrix and vector formulations, in a fraction of the time it would take to write a program in a scalar, non-interactive language such as C or Fortran. The name MATLAB stands for matrix laboratory. MATLAB was originally written to provide easy access to the matrix software developed by the LINPACK and EISPACK projects, which together represented the state of the art in software for matrix computation. MATLAB has evolved over many years with input from many users. In university environments, it is the standard instructional tool for introductory and advanced courses in mathematics, engineering, and science. In industry, MATLAB is the tool of choice for high-productivity research, development, and analysis.

MATLAB features a family of application-specific solutions called toolboxes. Very important to most users of MATLAB, toolboxes allow you to learn and apply specialized technology. Toolboxes are comprehensive collections of MATLAB functions (M-files) that extend the MATLAB environment to solve particular classes of problems. Areas in which toolboxes are available include signal processing, control systems, neural networks, fuzzy logic, wavelets, simulation, and many others.


## Computer Science Assignment Help|Python Exam Help|Capturing packets with Wireshark

To start capturing packets, you can click on the name of an interface from the list of interfaces. For example, if you want to capture traffic on your Ethernet network, double-click on the Ethernet connection interface:

As soon as you click on the name of the interface, you will see that packets start to appear in real time. Wireshark captures every packet that is sent to or from your system, so you will see a flood of data in the Wireshark dashboard. There are many ways to filter this traffic:

• To filter traffic from any specific IP address, type ip.addr == 'xxx.xx.xx.xx' in the Apply a display filter field
• To filter traffic for a specific protocol, say, TCP, UDP, SMTP, ARP, and DNS requests, just type the protocol name into the Apply a display filter field

We can use the Apply a display filter box to filter traffic from any IP address or protocol:

The graphical interface of Wireshark is mainly divided into the following sections:

• The toolbar, where you have all the options that can be applied before and after the capture
• The main toolbar, where you have the most frequently used options in Wireshark
• The filter bar, where you can quickly apply filters to the current capture
• The packet list, which shows a summary of each packet that is captured by Wireshark
• The packet details panel, which, once you have selected a packet in the packet list, shows detailed information about that packet
• The packet bytes panel, which shows the bytes of the selected packet and highlights the bytes corresponding to the field that's selected in the packet details panel
• The status bar, which shows some information about the current state of Wireshark and the capture

## Computer Science Assignment Help|Python Exam Help|Network traffic in Wireshark

Network traffic, or network data, is the number of packets moving across a network at any given point in time. The following is a classical formula for obtaining the traffic volume of a network: Traffic volume = Traffic intensity (rate) × Time
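As a quick worked example of this formula (the numbers are purely illustrative):

```python
def traffic_volume(rate_bps: float, seconds: float) -> float:
    """Traffic volume = traffic intensity (rate) * time."""
    return rate_bps * seconds

# A 10 Mbit/s flow sustained for 60 seconds carries 600 Mbit in total
print(traffic_volume(10e6, 60))  # 600000000.0
```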

In the following screenshot, we can see what the network traffic looks like in Wireshark:

In the previous screenshot, we can see all the information that is sent over, along with the data packets on a network. It includes several pieces of information, including the following:

• Time: The time at which packets are captured
• Source: The source from which the packet originated
• Destination: The sink where packets reach their final destination
• Protocol: The protocol (or set of rules) that the packet followed during its journey, such as TCP, UDP, SMTP, or ARP
• Info: The information that the packet contains

The Wireshark website contains sample capture files that you can import into Wireshark. You can also inspect the packets that they contain: https://wiki.wireshark.org/SampleCaptures.

When you start capturing packets, Wireshark uses colors to identify the types of traffic that can occur, among which we can highlight green for TCP traffic, blue for DNS traffic, and black for traffic that has errors at the packet level.

To see exactly what the color codes mean, click View | Coloring rules. You can also customize and modify the coloring rules in this screen.

If you need to change the color of one of the options, just double-click it and choose the color you want.



Now, we are going to create the same script but, instead of using urllib or requests, we are going to use the socket module so that we can work at a low level. To do so, create a text file called RFC_download_socket.py. The main difference here is that we are using the socket module instead of urllib or requests. socket is Python's interface to the operating system's TCP and UDP implementations. We have to tell socket which transport layer protocol we want to use. We do this by using the socket.create_connection() convenience function, which always creates a TCP connection. To establish the connection, we use port 80, which is the standard port number for web services over HTTP.

Next, we deal with the network communication over the TCP connection. We send the entire request string to the server using the sendall() call. Data sent through TCP must be raw bytes, so we have to encode the request text as ASCII before sending it.
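A minimal sketch of this connect-and-send flow, together with the receive loop discussed next (the host and path are placeholders, not the book's exact script):

```python
import socket

def http_get(host: str, port: int = 80, path: str = "/") -> str:
    """Fetch a resource over a raw TCP socket with a minimal HTTP/1.0 request."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
    # create_connection() always opens a TCP connection
    with socket.create_connection((host, port)) as sock:
        # TCP carries raw bytes, so the request text is encoded first
        sock.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # empty read: server sent everything and closed
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

The same loop works against any HTTP server reachable over TCP, and the returned string contains the status line and headers as well as the body.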

Then, we piece together the server's response as it arrives in the while loop. Bytes that are sent to us through a TCP socket are presented to our application as a continuous stream. So, like any stream of unknown length, we have to read it iteratively. The recv() call returns an empty bytes object once the server has sent all of its data and closed the connection. Finally, we can use this as the condition for breaking out of the loop and printing the response.

This section will help you review the basics of Wireshark so that you can capture packets, filter them, and inspect them. You can use Wireshark to analyze the network traffic of a suspicious program, analyze the traffic flow in your network, or solve network problems. We will also review the pyshark module for capturing packets in Python.

## Computer Science Assignment Help|Python Exam Help|Introduction to Wireshark

Wireshark is a network packet analysis tool that captures packets in real time and displays them in a graphic interface. Wireshark includes filters, color coding, and other features that allow you to analyze network traffic and inspect packets individually.

Wireshark implements a wide range of filters that facilitate the definition of search criteria for the more than 1,000 protocols it currently supports. All of this happens through a simple and intuitive interface that allows each of the captured packets to be broken down into layers.

Because Wireshark understands the structure of these protocols, we can visualize the fields of each of the headers and layers that make up the packets, which gives the network administrator a wide range of possibilities when analyzing traffic.

One of the advantages of Wireshark is that, at any given moment, we can capture data on a network for as long as we want and then store it so that we can perform the analysis later. It works on several platforms, such as Windows, OS X, Linux, and Unix.

Wireshark is also considered a protocol analyzer or packet sniffer, thus allowing us to observe the messages that are exchanged between applications. For example, if we capture an HTTP message, the packet analyzer must know that this message is encapsulated in a TCP segment, which, in turn, is encapsulated in an IP packet, which, in turn, is encapsulated in an Ethernet frame.

Wireshark is composed mainly of two elements: a packet capture library, which receives a copy of each data link frame that is either sent or received, and a packet analyzer, which shows the fields corresponding to each of the captured packets. To do this, the packet analyzer must know about the protocols that it is analyzing so that the information that’s shown is consistent.


## Computer Science Assignment Help|Machine Learning Exam Help|COMP4702

statistics-lab™ supports your study-abroad career. It has built a reputation for reliable, high-quality, and original Statistics writing services in machine learning, and our experts have extensive experience with machine learning assignments of every kind.


## Computer Science Assignment Help|Machine Learning Exam Help|Intuition and Main Results

Consider first the training error $E_{\text {train }}$ defined in (5.3). Since
$$\operatorname{tr} \mathbf{Y} \mathbf{Q}^2(\gamma) \mathbf{Y}^{\boldsymbol{\top}}=-\frac{\partial}{\partial \gamma} \operatorname{tr} \mathbf{Y} \mathbf{Q}(\gamma) \mathbf{Y}^{\top},$$
a deterministic equivalent for the resolvent $\mathbf{Q}(\gamma)$ is sufficient to access the asymptotic behavior of $E_{\text {train }}$.
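This identity can be checked by differentiating the resolvent: writing $\mathbf{Q}(\gamma)=(\mathbf{A}+\gamma \mathbf{I}_n)^{-1}$ for a matrix $\mathbf{A}$ that does not depend on $\gamma$,

```latex
\frac{\partial}{\partial \gamma} \mathbf{Q}(\gamma)
  = -\left(\mathbf{A}+\gamma \mathbf{I}_n\right)^{-1}\left(\mathbf{A}+\gamma \mathbf{I}_n\right)^{-1}
  = -\mathbf{Q}^{2}(\gamma),
\qquad \text{hence} \qquad
\operatorname{tr} \mathbf{Y} \mathbf{Q}^{2}(\gamma) \mathbf{Y}^{\top}
  = -\frac{\partial}{\partial \gamma} \operatorname{tr} \mathbf{Y} \mathbf{Q}(\gamma) \mathbf{Y}^{\top}.
```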
With a linear activation $\sigma(t)=t$, the resolvent of interest
$$\mathbf{Q}(\gamma)=\left(\frac{1}{n} \sigma(\mathbf{W X})^{\top} \sigma(\mathbf{W} \mathbf{X})+\gamma \mathbf{I}_n\right)^{-1}$$
is the same as in Theorem 2.6. In a sense, the evaluation of $\mathbf{Q}(\gamma)$ (and subsequently $E_{\text {train }}$) calls for an extension of Theorem $2.6$ to handle the case of nonlinear activations. Recall now that the main ingredients used to derive a deterministic equivalent for (the linear case) $\mathbf{Q}=\left(\mathbf{X}^{\top} \mathbf{W}^{\top} \mathbf{W} \mathbf{X} / n+\gamma \mathbf{I}_n\right)^{-1}$ are that (i) $\mathbf{X}^{\top} \mathbf{W}^{\top}$ has i.i.d. columns and (ii) its $i$th column $\left[\mathbf{W}^{\top}\right]_i$ has i.i.d. (or linearly dependent) entries, so that the key Lemma $2.11$ applies. These hold, in the linear case, due to the i.i.d. property of the entries of $\mathbf{W}$. However, while for Item (i) the nonlinear $\Sigma^{\top}=\sigma(\mathbf{W X})^{\top}$ still has i.i.d. columns, for Item (ii) its $i$th column $\sigma\left(\left[\mathbf{X}^{\top} \mathbf{W}^{\top}\right]_{\cdot i}\right)$ no longer has i.i.d. or linearly dependent entries. Therefore, the main technical difficulty here is to obtain a nonlinear version of the trace lemma, Lemma 2.11. That is, we expect that the concentration of quadratic forms around their expectation remains valid despite the application of the entry-wise nonlinearity $\sigma$. This naturally falls within the concentration of measure theory discussed in Section $2.7$ and is given by the following lemma.

Lemma 5.1 (Concentration of nonlinear quadratic form, Louart et al. [2018, Lemma 1]). For $\mathbf{w} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{I}_p\right)$, 1-Lipschitz $\sigma(\cdot)$, and $\mathbf{A} \in \mathbb{R}^{n \times n}, \mathbf{X} \in \mathbb{R}^{p \times n}$ such that $\|\mathbf{A}\| \leq 1$ and $\|\mathbf{X}\|$ bounded with respect to $p, n$, we have
$$\mathbb{P}\left(\left|\frac{1}{n} \sigma\left(\mathbf{w}^{\top} \mathbf{X}\right) \mathbf{A} \sigma\left(\mathbf{X}^{\top} \mathbf{w}\right)-\frac{1}{n} \operatorname{tr} \mathbf{A} \mathbf{K}\right|>t\right) \leq C e^{-c n \min \left(t, t^2\right)}$$
for some $C, c>0$ and $p / n \in(0, \infty)$, with
$$\mathbf{K} \equiv \mathbf{K}_{\mathbf{X X}} \equiv \mathbb{E}_{\mathbf{w} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{I}_p\right)}\left[\sigma\left(\mathbf{X}^{\top} \mathbf{w}\right) \sigma\left(\mathbf{w}^{\top} \mathbf{X}\right)\right] \in \mathbb{R}^{n \times n} .$$

## Computer Science Assignment Help|Machine Learning Exam Help|Consequences for Learning with Large Neural Networks

To validate the asymptotic analysis in Theorem $5.1$ and Corollary $5.1$ on real-world data, Figures $5.2$ and $5.3$ compare the empirical MSEs with their limiting behavior predicted in Corollary 5.1, for a random network of $N=512$ neurons and various types of Lipschitz and non-Lipschitz activations $\sigma(\cdot)$, respectively. The regressor $\boldsymbol{\beta} \in \mathbb{R}^p$ maps the vectorized images from the Fashion-MNIST dataset (classes 1 and 2) [Xiao et al., 2017] to their corresponding uni-dimensional ($d=1$) output labels $\mathbf{Y}_{1 i}, \hat{\mathbf{Y}}_{1 j} \in\{\pm 1\}$. For $n, p, N$ of order a few hundreds (so not very large when compared to typical modern neural network dimensions), a close match between theory and practice is observed for the Lipschitz activations in Figure 5.2. The precision is less accurate but still quite good for the case of non-Lipschitz activations in Figure 5.3, which, we recall, are formally not supported by the theorem statement – here for $\sigma(t)=1-t^2 / 2$, $\sigma(t)=1_{t>0}$, and $\sigma(t)=\operatorname{sign}(t)$. For all activations, the deviation from theory is more acute for small values of regularization $\gamma$.

Figures $5.2$ and $5.3$ confirm that while the training error is a monotonically increasing function of the regularization parameter $\gamma$, there always exists an optimal value for $\gamma$ which minimizes the test error. In particular, the theoretical formulas derived in Corollary $5.1$ allow for a (data-dependent) fast offline tuning of the hyperparameter $\gamma$ of the network, in the setting where $n, p, N$ are not too small and comparable. In terms of activation functions (those listed here), we observe that, on the Fashion-MNIST dataset, the ReLU nonlinearity $\sigma(t)=\max (t, 0)$ is optimal and achieves the minimum test error, while the quadratic activation $\sigma(t)=1-t^2 / 2$ is the worst and produces much higher training and test errors compared to others. This observation will be theoretically explained through a deeper analysis of the corresponding kernel matrix $\mathbf{K}$, as performed in Section 5.1.2. Lastly, although not immediate at first sight, the training and test error curves of $\sigma(t)=1_{t>0}$ and $\sigma(t)=\operatorname{sign}(t)$ are indeed the same, up to a shift in $\gamma$, as a consequence of the fact that $\operatorname{sign}(t)=2 \cdot 1_{t>0}-1$.


## Computer Science Assignment Help|Machine Learning Exam Help|COMP30027


## Computer Science Assignment Help|Machine Learning Exam Help|Random Neural Networks

Although much less popular than modern deep neural networks, neural networks with random fixed weights are simpler to analyze. Such networks have frequently arisen in the past decades as an appropriate solution to handle the possibly restricted number of training data, to reduce the computational and memory complexity and, from another viewpoint, can be seen as efficient random feature extractors. These neural networks in fact find their roots in Rosenblatt’s perceptron [Rosenblatt, 1958] and have then been many times revisited, rediscovered, and analyzed in a number of works, both in their feedforward [Schmidt et al., 1992] and recurrent [Gelenbe, 1993] versions. The simplest modern versions of these random networks are the so-called extreme learning machine [Huang et al., 2012] for the feedforward case, which one may view as a mere linear regression method on nonlinear random features, and the echo state network [Jaeger, 2001] for the recurrent case. Also see Scardapane and Wang [2017] for a more exhaustive overview of randomness in neural networks.

It is also to be noted that deep neural networks are initialized at random and that random operations (such as random node deletions or voluntarily not learning a large proportion of randomly initialized neural network weights, that is, random dropout) are common and efficient in neural network learning [Srivastava et al., 2014, Frankle and Carbin, 2019]. We may also point to the recent endeavor toward neural network “learning without backpropagation,” which, inspired by biological neural networks (which naturally do not operate backpropagation learning), proposes learning mechanisms with fixed random backward weights and asymmetric forward learning procedures [Lillicrap et al., 2016, Nøkland, 2016, Baldi et al., 2018, Frenkel et al., 2019, Han et al., 2019]. As such, the study of random neural network structures may be instrumental to future improved understanding and designs of advanced neural network structures.

As shall be seen subsequently, the simple models of random neural networks are to a large extent connected to kernel matrices. More specifically, the classification or regression performance at the output of these random neural networks is a functional of random matrices that fall into the wide class of kernel random matrices, yet of a slightly different form than those studied in Section 4. Perhaps more surprisingly, this connection still exists for deep neural networks which are (i) randomly initialized and (ii) then trained with gradient descent, via the so-called neural tangent kernel [Jacot et al., 2018] by considering the “infinitely many neurons” limit, that is, the limit where the network widths of all layers go to infinity simultaneously. This close connection between neural networks and kernels has triggered a renewed interest for the theoretical investigation of deep neural networks from various perspectives including optimization [Du et al., 2019, Chizat et al., 2019], generalization [Allen-Zhu et al., 2019, Arora et al., 2019a, Bietti and Mairal, 2019], and learning dynamics [Lee et al., 2020, Advani et al., 2020, Liao and Couillet, 2018a]. These works shed new light on our theoretical understanding of deep neural network models and specifically demonstrate the significance of studying simple networks with random weights and their associated kernels to assess the intrinsic mechanisms of more elaborate and practical deep networks.

## Computer Science Assignment Help|Machine Learning Exam Help|Regression with Random Neural Networks

Throughout this section, we consider a feedforward single-hidden-layer neural network, as illustrated in Figure $5.1$ (displayed, for notational convenience, from right to left). A similar class of single-hidden-layer neural network models, however with a recurrent structure, will be discussed later in Section 5.3.

Given input data $\mathbf{X}=\left[\mathbf{x}_1, \ldots, \mathbf{x}_n\right] \in \mathbb{R}^{p \times n}$, we denote by $\Sigma \equiv \sigma(\mathbf{W} \mathbf{X}) \in \mathbb{R}^{N \times n}$ the output of the first layer comprising $N$ neurons. This output arises from the premultiplication of $\mathbf{X}$ by some random weight matrix $\mathbf{W} \in \mathbb{R}^{N \times p}$ with i.i.d. (say standard Gaussian) entries and the entry-wise application of the nonlinear activation function $\sigma: \mathbb{R} \rightarrow \mathbb{R}$. As such, the columns $\sigma\left(\mathbf{W x}_i\right)$ of $\Sigma$ can be seen as random nonlinear features of $\mathbf{x}_i$. The second-layer weight $\boldsymbol{\beta} \in \mathbb{R}^{N \times d}$ is then learned to adapt the feature matrix $\Sigma$ to some associated target $\mathbf{Y}=\left[\mathbf{y}_1, \ldots, \mathbf{y}_n\right] \in \mathbb{R}^{d \times n}$, for instance, by minimizing the Frobenius norm $\left\|\mathbf{Y}-\boldsymbol{\beta}^{\top} \Sigma\right\|_F^2$.

Remark 5.1 (Random neural networks, random feature maps and random kernels). The columns of $\Sigma$ may be seen as the output of the $\mathbb{R}^p \rightarrow \mathbb{R}^N$ random feature map $\phi: \mathbf{x}_i \mapsto \sigma\left(\mathbf{W} \mathbf{x}_i\right)$ for some given $\mathbf{W} \in \mathbb{R}^{N \times p}$. In Rahimi and Recht [2008], it is shown that, for every nonnegative definite “shift-invariant” kernel of the form $(\mathbf{x}, \mathbf{y}) \mapsto f\left(\|\mathbf{x}-\mathbf{y}\|^2\right)$, there exist appropriate choices for $\sigma$ and the law of the entries of $\mathbf{W}$ so that, as the number of neurons or random features $N \rightarrow \infty$,
$$\sigma\left(\mathbf{W} \mathbf{x}_i\right)^{\top} \sigma\left(\mathbf{W} \mathbf{x}_j\right) \stackrel{\text { a.s. }}{\longrightarrow} f\left(\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right) . \tag{5.1}$$
As such, for large enough $N$ (which in general must scale with $n, p$), the bivariate function $(\mathbf{x}, \mathbf{y}) \mapsto \sigma(\mathbf{W} \mathbf{x})^{\top} \sigma(\mathbf{W y})$ approximates a kernel function of the type $f\left(\|\mathbf{x}-\mathbf{y}\|^2\right)$ studied in Chapter 4. This result is then generalized, in subsequent works, to a larger family of kernels including inner-product kernels [Kar and Karnick, 2012], additive homogeneous kernels [Vedaldi and Zisserman, 2012], etc. Another, possibly more marginal, connection with the previous sections is that $\sigma\left(\mathbf{w}^{\top} \mathbf{x}\right)$ can be interpreted as a “properly scaling” inner-product kernel function applied to the “data” pair $\mathbf{w}, \mathbf{x} \in \mathbb{R}^p$. This technically induces another strong relation between the study of kernels and that of neural networks.

Again, similar to the concentration of (Euclidean) distance extensively explored in this chapter, the entry-wise convergence in (5.1) does not imply convergence in the operator norm sense, which, as we shall see, leads directly to the so-called “double descent” test curve in random feature/neural network models. If the network output weight matrix $\boldsymbol{\beta}$ is designed to minimize the regularized MSE $L(\boldsymbol{\beta})=\frac{1}{n} \sum_{i=1}^n\left\|\mathbf{y}_i-\boldsymbol{\beta}^{\top} \sigma\left(\mathbf{W x}_i\right)\right\|^2+\gamma\|\boldsymbol{\beta}\|_F^2$, for some regularization parameter $\gamma>0$, then $\boldsymbol{\beta}$ takes the explicit form of a ridge regressor
$$\boldsymbol{\beta} \equiv \frac{1}{n} \Sigma\left(\frac{1}{n} \Sigma^{\top} \Sigma+\gamma \mathbf{I}_n\right)^{-1} \mathbf{Y}^{\top},$$
which follows from differentiating $L(\boldsymbol{\beta})$ with respect to $\boldsymbol{\beta}$ to obtain $0=\gamma \boldsymbol{\beta}+\frac{1}{n} \Sigma\left(\Sigma^{\top} \boldsymbol{\beta}-\mathbf{Y}^{\top}\right)$, so that $\left(\frac{1}{n} \Sigma \Sigma^{\top}+\gamma \mathbf{I}_N\right) \boldsymbol{\beta}=\frac{1}{n} \Sigma \mathbf{Y}^{\top}$, which, along with the identity $\left(\frac{1}{n} \Sigma \Sigma^{\top}+\gamma \mathbf{I}_N\right)^{-1} \Sigma=\Sigma\left(\frac{1}{n} \Sigma^{\top} \Sigma+\gamma \mathbf{I}_n\right)^{-1}$ valid for $\gamma>0$, gives the result.
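Both the closed form and the push-through identity used in this derivation are easy to verify numerically; the following sketch uses random placeholder data (the sizes $N$, $n$, $d$ and the value of $\gamma$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
N, n, d, gamma = 50, 80, 3, 0.1

Sigma = rng.standard_normal((N, n))   # Sigma = sigma(W X), one column per sample
Y = rng.standard_normal((d, n))       # targets, one column per sample

# Explicit ridge solution: beta = (1/n) Sigma ((1/n) Sigma^T Sigma + gamma I_n)^{-1} Y^T
beta = Sigma @ np.linalg.solve(Sigma.T @ Sigma / n + gamma * np.eye(n), Y.T) / n

# First-order condition: 0 = gamma*beta + (1/n) Sigma (Sigma^T beta - Y^T)
grad = gamma * beta + Sigma @ (Sigma.T @ beta - Y.T) / n
print(np.max(np.abs(grad)))           # numerically zero

# Push-through identity:
# ((1/n) Sigma Sigma^T + gamma I_N)^{-1} Sigma = Sigma ((1/n) Sigma^T Sigma + gamma I_n)^{-1}
lhs = np.linalg.solve(Sigma @ Sigma.T / n + gamma * np.eye(N), Sigma)
rhs = Sigma @ np.linalg.inv(Sigma.T @ Sigma / n + gamma * np.eye(n))
print(np.max(np.abs(lhs - rhs)))      # numerically zero
```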

## 计算机代写|机器学习代写machine learning代考|Concluding Remarks

Before the present chapter, the first part of the book was mostly concerned with the sample covariance matrix model $\mathbf{X} \mathbf{X}^{\top} / n$ (and more marginally with the Wigner model $\mathbf{X} / \sqrt{n}$ for symmetric $\mathbf{X}$), where the columns of $\mathbf{X}$ are independent and the entries of each column are independent or linearly dependent. Historically, this model and its numerous variations (with a variance profile, with right-side correlation, summed up with other independent matrices of the same form, etc.) have covered most of the mathematical and applied interest of the first two decades (since the early nineties) of intense random matrix advances. The main drivers for these early developments were statistics, signal processing, and wireless communications. The present chapter leaped much further in considering random matrix models with possibly highly correlated entries, with a specific focus on kernel matrices. When (moderately) large-dimensional data are considered, the intuition and theoretical understanding of kernel matrices developed in small-dimensional settings are no longer accurate; random matrix theory then provides accurate (and asymptotically exact) performance assessments, along with the possibility to largely improve the performance of kernel-based machine learning methods. This, in effect, creates a small revolution in our understanding of machine learning on realistic large datasets.

A first important finding of the analysis of large-dimensional kernel statistics reported here is the ubiquitous character of the Marčenko-Pastur and the semicircular laws. As a matter of fact, all random matrix models studied in this chapter, and in particular the kernel regimes $f\left(\mathbf{x}_i^{\top} \mathbf{x}_j / p\right)$ (which concentrate around $f(0)$) and $f\left(\mathbf{x}_i^{\top} \mathbf{x}_j / \sqrt{p}\right)$ (which tend to $f(\mathcal{N}(0,1))$), have a limiting eigenvalue distribution akin to a combination of the two laws. This combination may vary from case to case (compare for instance the results of Practical Lecture 3 to Theorem 4.4), but is often parametrized in such a way that the Marčenko-Pastur and semicircle laws appear as limiting cases (in the context of Practical Lecture 3, they correspond to the limiting cases of dense versus sparse kernels, and in Theorem 4.4 to the limiting cases of linear versus "purely" nonlinear kernels).

## 计算机代写|机器学习代写machine learning代考|Practical Course Material

In this section, we discuss Practical Lecture 3 (which evaluates the spectral behavior of uniformly sparsified kernels), related to the present Chapter 4. There we shall see, as for the $\alpha$-$\beta$ and properly scaling kernels of Sections 4.2.4 and 4.3, that, depending on the "level of sparsity," a combination of Marčenko-Pastur and semicircle laws is observed.
Practical Lecture Material 3 (Complexity-performance trade-off in spectral clustering with sparse kernel, Zarrouk et al. [2020]). In this exercise, we study the spectrum of a "punctured" version $\mathbf{K}=\mathbf{B} \odot\left(\mathbf{X}^{\top} \mathbf{X} / p\right)$ (with the Hadamard product $[\mathbf{A} \odot \mathbf{B}]_{i j}=[\mathbf{A}]_{i j}[\mathbf{B}]_{i j}$) of the linear kernel $\mathbf{X}^{\top} \mathbf{X} / p$, with data matrix $\mathbf{X} \in \mathbb{R}^{p \times n}$ and a symmetric random mask-matrix $\mathbf{B} \in\{0,1\}^{n \times n}$ having independent $[\mathbf{B}]_{i j} \sim \operatorname{Bern}(\epsilon)$ entries for $i \neq j$ (up to symmetry) and $[\mathbf{B}]_{i i}=b \in\{0,1\}$ fixed, in the limit $p, n \rightarrow \infty$ with $p / n \rightarrow c \in(0, \infty)$. This matrix mimics the computation of only a proportion $\epsilon \in(0,1)$ of the entries of $\mathbf{X}^{\top} \mathbf{X} / p$, and its impact on spectral clustering. Letting $\mathbf{X}=\left[\mathbf{x}_1, \ldots, \mathbf{x}_n\right]$ with $\mathbf{x}_i$ independently and uniformly drawn from the following symmetric two-class Gaussian mixture
$$\mathcal{C}_1: \mathbf{x}_i \sim \mathcal{N}\left(-\boldsymbol{\mu}, \mathbf{I}_p\right), \quad \mathcal{C}_2: \mathbf{x}_i \sim \mathcal{N}\left(+\boldsymbol{\mu}, \mathbf{I}_p\right)$$
for $\boldsymbol{\mu} \in \mathbb{R}^p$ such that $\|\boldsymbol{\mu}\|=O(1)$ with respect to $n, p$, we wish to study the effect of a uniform "zeroing out" of the entries of $\mathbf{X}^{\top} \mathbf{X}$ on the presence of an isolated spike in the spectrum of $\mathbf{K}$, and thus on the spectral clustering performance.
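A small simulation of this setting may help fix ideas (the parameter values below, $p=400$, $n=800$, $\epsilon=0.5$, $b=1$, $\|\boldsymbol{\mu}\|=2$, are illustrative choices, not prescribed by the exercise):

```python
import numpy as np

rng = np.random.default_rng(2)
p, n, eps = 400, 800, 0.5        # dimension, samples, fraction of computed entries
mu = np.zeros(p); mu[0] = 2.0    # class-mean vector, ||mu|| = O(1) w.r.t. p

# Two symmetric classes: x_i ~ N(-mu, I_p) for i < n/2, x_i ~ N(+mu, I_p) otherwise
j = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])
X = rng.standard_normal((p, n)) + np.outer(mu, j)

# Symmetric Bernoulli(eps) mask B with unit diagonal (b = 1)
U = rng.random((n, n)) < eps
B = np.triu(U, 1)
B = (B + B.T).astype(float)
np.fill_diagonal(B, 1.0)

K = B * (X.T @ X / p)            # punctured linear kernel (Hadamard product)

eigvals, eigvecs = np.linalg.eigh(K)   # ascending eigenvalues
v_top = eigvecs[:, -1]                 # eigenvector of the largest eigenvalue
align = abs(v_top @ j) / np.sqrt(n)    # alignment with the class vector j
print(eigvals[-1], align)
```

With these values, an isolated spike detaches from the bulk and its eigenvector correlates with the class vector $\mathbf{j}$, so spectral clustering remains possible despite half the kernel entries being zeroed out.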

We will study the spectrum of $\mathbf{K}$ using Stein's lemma and the Gaussian method discussed in Section 2.2.2. Let $\mathbf{Z}=\left[\mathbf{z}_1, \ldots, \mathbf{z}_n\right] \in \mathbb{R}^{p \times n}$ for $\mathbf{z}_i=\mathbf{x}_i-(-1)^a \boldsymbol{\mu} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{I}_p\right)$ with $\mathbf{x}_i \in \mathcal{C}_a$, and $\mathbf{M}=\boldsymbol{\mu} \mathbf{j}^{\top}$ with $\mathbf{j}=\left[-\mathbf{1}_{n / 2}, \mathbf{1}_{n / 2}\right]^{\top} \in \mathbb{R}^n$, so that $\mathbf{X}=\mathbf{M}+\mathbf{Z}$. First show that, for $\mathbf{Q} \equiv \mathbf{Q}(z)=\left(\mathbf{K}-z \mathbf{I}_n\right)^{-1}$,
\begin{aligned} \mathbf{Q}= & -\frac{1}{z} \mathbf{I}_n+\frac{1}{z}\left(\frac{\mathbf{Z}^{\top} \mathbf{Z}}{p} \odot \mathbf{B}\right) \mathbf{Q}+\frac{1}{z}\left(\frac{\mathbf{Z}^{\top} \mathbf{M}}{p} \odot \mathbf{B}\right) \mathbf{Q} \\ & +\frac{1}{z}\left(\frac{\mathbf{M}^{\top} \mathbf{Z}}{p} \odot \mathbf{B}\right) \mathbf{Q}+\frac{1}{z}\left(\frac{\mathbf{M}^{\top} \mathbf{M}}{p} \odot \mathbf{B}\right) \mathbf{Q} . \end{aligned}
To proceed, we need to go slightly beyond the study of these four terms.

## 计算机代写|机器学习代写machine learning代考|Distance and Inner-Product Random Kernel Matrices

The most widely used kernel model in machine learning applications is the heat kernel $\mathbf{K}=\left\{\exp \left(-\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / 2 \sigma^2\right)\right\}_{i, j=1}^n$, for some $\sigma>0$. It is thus natural to start the large-dimensional analysis of kernel random matrices by focusing on this model.
As mentioned in the previous sections, for the Gaussian mixture model above, as the dimension $p$ increases, $\sigma^2$ needs to scale as $O(p)$, say $\sigma^2=\tilde{\sigma}^2 p$ for some $\tilde{\sigma}^2=O(1)$, to avoid evaluating the exponential at increasingly large values as $p$ grows. As such, the prototypical kernel of present interest is
$$\mathbf{K}=\left\{f\left(\frac{1}{p}\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right)\right\}_{i, j=1}^n,$$
for $f$ a sufficiently smooth function (specifically, $f(t)=\exp \left(-t / 2 \tilde{\sigma}^2\right)$ for the heat kernel). As we will see though, it is highly desirable not to restrict ourselves to $f(t)=\exp \left(-t / 2 \tilde{\sigma}^2\right)$, so as to better appreciate the impact of the nonlinear kernel function $f$ on the (asymptotic) structural behavior of the kernel matrix $\mathbf{K}$.

## 计算机代写|机器学习代写machine learning代考|Euclidean Random Matrices with Equal Covariances

In order to get a first picture of the large-dimensional behavior of $\mathbf{K}$, let us first develop the distance $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / p$ for $\mathbf{x}_i \in \mathcal{C}_a$ and $\mathbf{x}_j \in \mathcal{C}_b$, with $i \neq j$.

For simplicity, let us assume for the moment $\mathbf{C}_1=\cdots=\mathbf{C}_k=\mathbf{I}_p$ and recall the notation $\mathbf{x}_i=\boldsymbol{\mu}_a+\mathbf{z}_i$. We have, for $i \neq j$ that “entry-wise,”
\begin{aligned} \frac{1}{p}\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2= & \frac{1}{p}\left\|\boldsymbol{\mu}_a-\boldsymbol{\mu}_b\right\|^2+\frac{2}{p}\left(\boldsymbol{\mu}_a-\boldsymbol{\mu}_b\right)^{\top}\left(\mathbf{z}_i-\mathbf{z}_j\right) \\ & +\frac{1}{p}\left\|\mathbf{z}_i\right\|^2+\frac{1}{p}\left\|\mathbf{z}_j\right\|^2-\frac{2}{p} \mathbf{z}_i^{\top} \mathbf{z}_j . \end{aligned}
For $\left\|\mathbf{x}_i\right\|$ of order $O(\sqrt{p})$, if $\left\|\boldsymbol{\mu}_a\right\|=O(\sqrt{p})$ for all $a \in\{1, \ldots, k\}$ (which would be natural), then $\left\|\boldsymbol{\mu}_a-\boldsymbol{\mu}_b\right\|^2 / p$ is a priori of order $O(1)$ while, by the central limit theorem, $\left\|\mathbf{z}_i\right\|^2 / p=1+O\left(p^{-1 / 2}\right)$. Also, again by the central limit theorem, $\mathbf{z}_i^{\top} \mathbf{z}_j / p=O\left(p^{-1 / 2}\right)$ and $\left(\boldsymbol{\mu}_a-\boldsymbol{\mu}_b\right)^{\top}\left(\mathbf{z}_i-\mathbf{z}_j\right) / p=O\left(p^{-1 / 2}\right)$.

As a consequence, for $p$ large, the distance $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / p$ is dominated by $\left\|\boldsymbol{\mu}_a-\boldsymbol{\mu}_b\right\|^2 / p+2$ and easily discriminates classes from the pairwise observations of $\mathbf{x}_i, \mathbf{x}_j$, making the classification asymptotically trivial (without having to resort to any kernel method). It is thus of interest to consider situations where the class distances are less significant, so as to understand how the choice of kernel comes into play in such more practical scenarios. To this end, we now demand that $$\left\|\boldsymbol{\mu}_a-\boldsymbol{\mu}_b\right\|=O(1),$$ which is also the minimal distance rate that can be discriminated by a mere Bayesian inference analysis, as thoroughly discussed in Section 1.1.3. Since the kernel function $f(\cdot)$ operates only on the distances $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|$, we may even request (up to centering all data by, say, the constant vector $\frac{1}{n} \sum_{a=1}^k n_a \boldsymbol{\mu}_a$) for simplicity that $\left\|\boldsymbol{\mu}_a\right\|=O(1)$ for each $a$.
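The expansion above is easily checked numerically; in the sketch below (two classes with means $\pm\boldsymbol{\mu}$, $\|\boldsymbol{\mu}\|=1$, illustrative sizes), all pairwise distances $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / p$ concentrate around $2$ with a spread of order $O(1/\sqrt{p})$, regardless of the classes of $\mathbf{x}_i, \mathbf{x}_j$:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
stats = {}
for p in (100, 1600):
    mu = np.zeros(p); mu[0] = 1.0     # ||mu_a - mu_b|| = 2 ||mu|| = O(1)
    j = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])
    X = rng.standard_normal((p, n)) + np.outer(mu, j)   # x_i = (+/-) mu + z_i

    # All pairwise squared distances divided by p
    G = X.T @ X
    sq = np.diag(G)
    D = (sq[:, None] + sq[None, :] - 2 * G) / p
    off = D[~np.eye(n, dtype=bool)]
    stats[p] = (off.mean(), off.std())
    print(p, stats[p])                # mean ~ 2, spread shrinking like 1/sqrt(p)
```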

## 计算机代写|机器学习代写machine learning代考|The Nontrivial Growth Rates

In classical large-$n$-only asymptotic statistics, laws of large numbers demand a scaling by $1 / n$ of the summed observations. When centered, central limit theorems then occur after multiplication of the average by $\sqrt{n}$. A similar requirement is needed when we now consider that the dimension $p$ of the data is also large. In particular, we will demand that the norm of each observation remain bounded. Assuming $\mathbf{x} \in \mathbb{R}^p$ is a vector of bounded entries, that is, each of order $O(1)$ with respect to $p$, the natural normalization is typically $\mathbf{x} / \sqrt{p}$.

In the context of kernel methods, for data $\mathbf{x}_1, \ldots, \mathbf{x}_n$, one wishes that the argument of $f(\cdot)$ in the inner-product kernel $f\left(\mathbf{x}_i^{\top} \mathbf{x}_j\right)$ or the distance kernel $f\left(\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right)$ be of order $O(1)$, when $f$ is assumed independent of $p$.

The "correct" scaling, however, appears not to be so immediate. Letting $\mathbf{x}_i$ have entries of order $O(1)$, one naturally has that $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2=\left\|\mathbf{x}_i\right\|^2+\left\|\mathbf{x}_j\right\|^2-2 \mathbf{x}_i^{\top} \mathbf{x}_j=O(p)$ and it thus appears natural to scale $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2$ by $1 / p$. Similarly, if the norm of the mean $\left\|\mathbb{E}\left[\mathbf{x}_i\right]\right\|$ of $\mathbf{x}_i$ has the same order of magnitude as $\left\|\mathbf{x}_i\right\|$ itself (as it should in general), then for $\mathbf{x}_i, \mathbf{x}_j$ independent, $\mathbb{E}\left[\mathbf{x}_i^{\top} \mathbf{x}_j\right]=O(p)$. So again, one should scale the inner product also by $1 / p$, to obtain kernel matrices of the type $$\mathbf{K}=\left\{f\left(\frac{1}{p}\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right)\right\}_{i, j=1}^n, \text { and }\left\{f\left(\frac{1}{p} \mathbf{x}_i^{\top} \mathbf{x}_j\right)\right\}_{i, j=1}^n .$$
Section 4.2 (and most applications thereafter) will be placed under these kernel forms. The most commonly used Gaussian kernel matrix, defined as $\mathbf{K}=\left\{\exp \left(-\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / 2 \sigma^2\right)\right\}_{i, j=1}^n$, falls into this family as one usually demands that $\sigma^2 \sim \mathbb{E}\left[\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right]$ (to avoid evaluating the exponential close to zero or infinity).

However, as already demonstrated in Section 1.1.3, if $n$ scales like $p$, then, for the classification problem to be asymptotically nontrivial, the difference $\left\|\mathbb{E}\left[\mathbf{x}_i\right]-\mathbb{E}\left[\mathbf{x}_j\right]\right\|^2$ needs to scale like $O(1)$ rather than $O(p)$ (otherwise data classes would be too easy to cluster for all large $n, p$), resulting in $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / p$ possibly converging to a constant value irrespective of the data classes (of $\mathbf{x}_i$ and $\mathbf{x}_j$), with a typical "spread" of order $O(1 / \sqrt{p})$. Similarly, up to re-centering,${ }^2$ $\mathbf{x}_i^{\top} \mathbf{x}_j / p$ scales like $O(1 / \sqrt{p})$ rather than $O(1)$. As such, it seems more appropriate to normalize the kernel matrix entries as $$[\mathbf{K}]_{i j}=f\left(\frac{\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2}{\sqrt{p}}-\frac{1}{n(n-1)} \sum_{i^{\prime}, j^{\prime}} \frac{\left\|\mathbf{x}_{i^{\prime}}-\mathbf{x}_{j^{\prime}}\right\|^2}{\sqrt{p}}\right), \text { or }[\mathbf{K}]_{i j}=f\left(\frac{1}{\sqrt{p}} \mathbf{x}_i^{\top} \mathbf{x}_j\right),$$
in order here to avoid evaluating $f$ essentially at a single value (equal to zero for the inner-product kernel or equal to the average “common” limiting intra-data distance for the distance kernel).
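The contrast between the two normalizations can be observed directly; in this sketch (two-class Gaussian data with $O(1)$ mean separation, illustrative sizes), the $1/p$-scaled distances collapse onto a single value while the centered $1/\sqrt{p}$ scaling keeps the argument of $f$ spread at order $O(1)$:

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 800, 400
mu = np.zeros(p); mu[0] = 1.0     # O(1) separation between the two class means
j = np.concatenate([-np.ones(n // 2), np.ones(n // 2)])
X = rng.standard_normal((p, n)) + np.outer(mu, j)

G = X.T @ X
sq = np.diag(G)
D = sq[:, None] + sq[None, :] - 2 * G        # squared pairwise distances
off = ~np.eye(n, dtype=bool)

# Naive 1/p scaling: all arguments of f collapse around 2, spread O(1/sqrt(p))
naive = D[off] / p
# Properly scaled: center by the mean distance, divide by sqrt(p) -> spread O(1)
centered = (D[off] - D[off].mean()) / np.sqrt(p)
print(naive.std(), centered.std())
```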

This "properly scaling" setting is in fact much richer than the $1 / p$ normalization when $n, p$ are of the same order of magnitude. Sections 4.2.4 and 4.3 elaborate on this scenario.

## 计算机代写|机器学习代写machine learning代考|Statistical Data Model

In the remainder of the section, we assume the observation of $n$ independent data vectors from a total of $k$ classes gathered as $\mathbf{X}=\left[\mathbf{x}_1, \ldots, \mathbf{x}_n\right] \in \mathbb{R}^{p \times n}$, where $$\begin{array}{c} \mathbf{x}_1, \ldots, \mathbf{x}_{n_1} \sim \mathcal{N}\left(\boldsymbol{\mu}_1, \mathbf{C}_1\right) \\ \vdots \\ \mathbf{x}_{n-n_k+1}, \ldots, \mathbf{x}_n \sim \mathcal{N}\left(\boldsymbol{\mu}_k, \mathbf{C}_k\right), \end{array}$$ which is a $k$-class Gaussian mixture model (GMM) with fixed cardinalities $n_1, \ldots, n_k$ in each class.${ }^3$ The fact that the data are indexed according to classes simplifies the notation but has no practical consequence in the analysis. We will denote $\mathcal{C}_a$ the class number "$a$," so in particular $$\mathbf{x}_i \sim \mathcal{N}\left(\boldsymbol{\mu}_a, \mathbf{C}_a\right) \Leftrightarrow \mathbf{x}_i \in \mathcal{C}_a$$ for $a \in\{1, \ldots, k\}$, and will use for convenience the matrix $$\mathbf{J}=\left[\mathbf{j}_1, \ldots, \mathbf{j}_k\right] \in \mathbb{R}^{n \times k}, \quad \mathbf{j}_a=[\underbrace{0, \ldots, 0}_{n_1+\cdots+n_{a-1}}, \underbrace{1, \ldots, 1}_{n_a}, \underbrace{0, \ldots, 0}_{n_{a+1}+\cdots+n_k}]^{\top},$$
which is the indicator matrix of the class labels $(\mathbf{J}$ is a priori known under a supervised learning setting and is to be fully or partially recovered under a semi-supervised or unsupervised learning setting).
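A minimal helper that builds $\mathbf{J}$ from the class cardinalities $n_1, \ldots, n_k$ (the function name `class_indicator` is ours, for illustration only):

```python
import numpy as np

def class_indicator(n_sizes):
    """Build J = [j_1, ..., j_k] in R^{n x k} from class cardinalities."""
    n, k = sum(n_sizes), len(n_sizes)
    J = np.zeros((n, k))
    start = 0
    for a, n_a in enumerate(n_sizes):
        J[start:start + n_a, a] = 1.0   # ones on the block of class a
        start += n_a
    return J

J = class_indicator([3, 2, 4])   # k = 3 classes, n = 9 samples
print(J.T @ J)                   # diag(n_1, ..., n_k): columns are orthogonal
```

Note that $\mathbf{J}^{\top} \mathbf{J}=\operatorname{diag}\left(n_1, \ldots, n_k\right)$ and each row of $\mathbf{J}$ sums to one, since every sample belongs to exactly one class.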

We shall systematically make the following simplifying growth rate assumption for $p, n$ and $n_1, \ldots, n_k$.

Assumption 1 (Growth rate of data size and number). As $n \rightarrow \infty, p / n \rightarrow c \in(0, \infty)$ and $n_a / n \rightarrow c_a \in(0,1)$.

This assumption, in particular, implies that each class is "large" in the sense that its cardinality increases with $n$.${ }^4$

In accordance with the discussions in Chapter 2, from a random matrix "universality" perspective, the Gaussian mixture assumption will often (yet not always) turn out to be equivalent to demanding that
$$\mathbf{x}_i \in \mathcal{C}_a: \mathbf{x}_i=\mu_a+\mathbf{C}_a^{\frac{1}{2}} \mathbf{z}_i$$
with $\mathbf{z}_i \in \mathbb{R}^p$ a random vector with i.i.d. entries of zero mean, unit variance, and bounded higher-order (e.g., fourth) moments.

This hypothesis is indeed quite restrictive as it imposes that the data, up to centering and linear scaling, are composed of i.i.d. entries. Equivalently, this suggests that only data which result from affine transformations of vectors with i.i.d. entries can be studied, a strong limitation in practice as "real data" are deemed much more complex.

Exploring the notion of concentrated random vectors introduced in Section 2.7, Chapter 8 will open up this discussion by showing that a much larger class of (statistical) data models embrace the same asymptotic statistics, and that most results discussed in the present section apply identically to broader models of data irreducible to vectors of independent entries.

## 计算机代写|机器学习代写machine learning代考|Kernel Methods

In a broad sense, kernel methods are at the core of many, if not most, machine learning algorithms [Schölkopf and Smola, 2018]. Given a set of data $\mathbf{x}_1, \ldots, \mathbf{x}_n \in \mathbb{R}^p$, most learning mechanisms rely on extracting the structural data information from direct or indirect pairwise comparisons $\kappa\left(\mathbf{x}_i, \mathbf{x}_j\right)$ for some affinity metric $\kappa(\cdot, \cdot)$. Gathered in an $n \times n$ matrix $$\mathbf{K}=\left\{\kappa\left(\mathbf{x}_i, \mathbf{x}_j\right)\right\}_{i, j=1}^n,$$
the “cumulative” effect of these comparisons for numerous $(n \gg 1)$ data is at the source of various supervised, semi-supervised, or unsupervised methods such as support vector machines, graph Laplacian-based learning, kernel spectral clustering, and has deep connections to neural networks.

These applications will be thoroughly discussed in Section 4.4. For the moment though, our main interest lies in the spectral characterization of the kernel matrix $\mathbf{K}$ itself, for various (classical) choices of affinity functions $\kappa$ and for various statistical models of the data $\mathbf{x}_i$.

Clearly, from a purely machine learning perspective, the choice of the affinity function $\kappa(\cdot, \cdot)$ is central to a good performance of the learning method under study. Since real data in general have highly complex structures, a typical viewpoint is to assume that the data points $\mathbf{x}_i$ and $\mathbf{x}_j$ are not directly comparable in their ambient space but that there exists a convenient feature extraction function $\phi: \mathbb{R}^p \rightarrow \mathbb{R}^q$ ($q \in \mathbb{N} \cup\{+\infty\}$) such that $\phi\left(\mathbf{x}_i\right)$ and $\phi\left(\mathbf{x}_j\right)$ are more amenable to comparison. Otherwise stated, in the image of $\phi(\cdot)$, the data are more "linear" (or more "linearly separable" if one seeks to group the data in affinity classes). The simplest affinity function between $\mathbf{x}_i$ and $\mathbf{x}_j$ would in this case be $\kappa\left(\mathbf{x}_i, \mathbf{x}_j\right)=\phi\left(\mathbf{x}_i\right)^{\top} \phi\left(\mathbf{x}_j\right)$.

Since $q$ may be larger (if not much larger) than $p$, the mere cost of evaluating $\phi\left(\mathbf{x}_i\right)^{\top} \phi\left(\mathbf{x}_j\right)$ can be deleterious to practical implementation. The so-called kernel trick is anchored in the remark that, for a certain class of such functions $\phi$, $\phi\left(\mathbf{x}_i\right)^{\top} \phi\left(\mathbf{x}_j\right)=f\left(\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right)$ or $f\left(\mathbf{x}_i^{\top} \mathbf{x}_j\right)$ for some function $f: \mathbb{R} \rightarrow \mathbb{R}$, and it thus suffices to evaluate $\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2$ or $\mathbf{x}_i^{\top} \mathbf{x}_j$ in the ambient space and then apply $f$ in an entrywise manner to evaluate all data affinities, leading to more practically convenient methods.
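A toy instance of the kernel trick, using the quadratic inner-product kernel $f(t)=t^2$ with explicit map $\phi(\mathbf{x})=\operatorname{vec}\left(\mathbf{x} \mathbf{x}^{\top}\right) \in \mathbb{R}^{p^2}$ (our choice of example, not one singled out in the text):

```python
import numpy as np

rng = np.random.default_rng(5)
p = 6
x, y = rng.standard_normal(p), rng.standard_normal(p)

# Explicit feature map for f(t) = t^2: phi(x) = vec(x x^T) in R^{p^2},
# so that phi(x)^T phi(y) = trace(x x^T y y^T) = (x^T y)^2
def phi(x):
    return np.outer(x, x).ravel()

lhs = phi(x) @ phi(y)     # O(p^2) work in the feature space
rhs = (x @ y) ** 2        # O(p) work in the ambient space via the kernel trick
print(lhs, rhs)           # identical values
```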

Although the class of such functions $f$ is inherently restricted by the need for a mapping $\phi$ to exist such that, say, $\phi\left(\mathbf{x}_i\right)^{\top} \phi\left(\mathbf{x}_j\right)=f\left(\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right)$ for all possible $\mathbf{x}_i, \mathbf{x}_j$ pairs (these are sometimes called Mercer kernel functions),${ }^1$ with time, practitioners have started to use arbitrary functions $f$ and to work with generic kernel matrices of the form
$$\mathbf{K}=\left\{f\left(\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2\right)\right\}_{i, j=1}^n, \quad \text { or } \quad \mathbf{K}=\left\{f\left(\mathbf{x}_i^{\top} \mathbf{x}_j\right)\right\}_{i, j=1}^n,$$
irrespective of the actual form, or even the existence, of an underlying feature extraction function $\phi$. There is, in particular, empirical evidence showing that well-chosen "indefinite" (i.e., non-Mercer-type) kernels, not being associated with a mapping $\phi$, can sometimes outperform conventional nonnegative definite kernels that satisfy Mercer's condition [Haasdonk, 2005, Luss and D'Aspremont, 2008].

## 计算机代写|机器学习代写machine learning代考|Basic Setting

As pointed out in Remark 4.1, and as shall become evident from the coming analysis, the small-dimensional intuition according to which $f$ should be a nonincreasing "valid" Mercer function becomes rather meaningless when dealing with large-dimensional data, essentially due to the "curse of dimensionality" and the concentration phenomenon in high dimensions.

To fully capture this aspect, a first important consideration is, as already mentioned in Section 1.1.3, to deal with "nontrivial" relative growth rates of the statistical data parameters with respect to the dimensions $p, n$. By nontrivial, we mean that the underlying classification or regression problem for which the kernel method is designed should neither be impossible nor trivially easy to solve as $p, n \rightarrow \infty$. The reason behind this request is fundamental, and also departs from many research works in machine learning which, instead, seek to prove that the method under study performs perfectly in the limit of large $n$ (with $p$ fixed in general): Here, we rather wish to account for the fact that, at finite but large $p, n$, the machine learning methods of practical interest are those which have nontrivial performances; thus, in what follows, "$n, p \rightarrow \infty$ in nontrivial growth rates" should really be understood as "$n, p$ are both large and the problem at hand is neither trivially easy nor impossibly hard to solve."

In this section, we will mostly focus on the use of kernel methods for classification, and thus the nontrivial settings are given in terms of the growth rate of the "distance" between (the statistics of) data classes. It will particularly appear that the very definition of the appropriate growth rates to ensure the nontrivial character of a machine learning problem to be solved through kernel methods depends on the kernel design itself, and that flagship kernels such as the Gaussian kernel $\kappa\left(\mathbf{x}_i, \mathbf{x}_j\right)=\exp \left(-\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2 / 2 \sigma^2\right)$ are in general quite suboptimal.

## 计算机代写|深度学习代写deep learning代考|Some Background on Darwin and Evolution

Charles Darwin formed his initial concepts and theory of natural selection based on his voyages around the continent of South America. From Darwin’s work, our thirst for understanding evolution drove our exploration into how life on earth shares and passes on selective traits using genetics.

Taking two decades to write, Darwin's most famous work, "On the Origin of Species," was published in 1859 as a seminal work that uprooted the natural sciences. His work challenged the idea of an intelligent creator and formed the basis for much of our natural and biological sciences to this day. The following excerpt is from that book and describes the theory of natural selection in Darwin's words:

“One general law, leading to the advancement of all organic beings, namely, multiply, vary, let the strongest live and the weakest die.”
Charles Darwin – On the Origin of Species
From this law, Darwin constructed his theory of evolution and the idea that life survives by passing its more successful traits on to offspring. While he didn't understand the processes of cellular mitosis and genetics, he did observe the selective passing of traits in multiple species. It wasn't until 1865 that an Augustinian monk named Gregor Mendel outlined his theory of gene inheritance by observing 7 traits in pea plants.

Mendel used the terms factors or traits to describe what we now understand as genes. It took almost another 3 decades before his work was recognized and the field of genetics was born. Since then, our understanding of genetics has grown, from gene therapy and gene hacking to solving complex problems and evolving code.

## Computer Science Assignment Help | Deep Learning | Applying Crossover – Reproduction

After the parents are selected, we can move on to applying crossover: essentially the reproduction process of creating offspring. Not unlike the cellular division process in biology, we simulate the combining of chromosomes through a crossover operation, in which each parent shares a slice of its gene sequence and combines it with a slice from the other parent.

Figure 2.9 shows the crossover operation being applied to 2 parents. In crossover, a point along the gene sequence is selected, either randomly or using some strategy. At this point the gene sequences of the parents are split and then recombined. In this simple example, we don't care what percentage of the gene sequence is shared with each offspring.

For more complex problems requiring thousands or millions of generations, we may prefer more balanced crossover strategies over this random selection method. We will cover the strategies we can use to define this operation later in the chapter.

In code, the crossover operation first copies each parent to create the raw children. Then we randomly determine whether a crossover occurs, using the variable crossover_rate. If there is a crossover, a random point along the gene sequence is generated as the crossover point. This point is used to split the gene sequences, and the children are produced by recombining the sequences of both parents.
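The steps just described can be sketched as a short Python function. This is a minimal illustration, not the book's listing; the function name `crossover` is our own, while `crossover_rate` matches the variable named in the text:

```python
import random

def crossover(parent1, parent2, crossover_rate=0.7):
    """Single-point crossover: copy the parents to create the raw
    children, then with probability crossover_rate pick a random
    split point and swap the gene-sequence tails."""
    child1, child2 = parent1[:], parent2[:]          # raw children are copies
    if random.random() < crossover_rate:
        point = random.randint(1, len(parent1) - 1)  # random crossover point
        child1 = parent1[:point] + parent2[point:]
        child2 = parent2[:point] + parent1[point:]
    return child1, child2

# Example: recombine two 6-gene parents
c1, c2 = crossover([0, 0, 0, 0, 0, 0], [1, 1, 1, 1, 1, 1])
print(c1, c2)
```

Note that when no crossover occurs, the children are simply clones of the parents, which is the usual convention in genetic algorithms.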

There are several variations on how crossover may be applied to the gene sequence. For this example, selecting a random crossover point and simply combining the sequences at the split works. However, in some problems only particular gene sequences are valid, in which case we need other methods that preserve gene sequences.
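One such sequence-preserving variation, not covered in the text above but standard in the genetic-algorithm literature, is ordered crossover (OX), used when genes encode a permutation (for example, a tour in a travelling-salesman problem) and naive splitting would create duplicate genes. A sketch, with the function name `ordered_crossover` our own:

```python
import random

def ordered_crossover(parent1, parent2):
    """Ordered crossover (OX): copy a random slice from parent1,
    then fill the remaining positions with parent2's genes in their
    original order, skipping any gene already copied. The child is
    always a valid permutation of the parents' genes."""
    size = len(parent1)
    a, b = sorted(random.sample(range(size), 2))   # random slice bounds
    child = [None] * size
    child[a:b + 1] = parent1[a:b + 1]              # keep slice from parent1
    fill = [g for g in parent2 if g not in child]  # parent2 order, no repeats
    for i in range(size):
        if child[i] is None:
            child[i] = fill.pop(0)
    return child
```

Because every gene appears exactly once in the child, OX never produces invalid sequences, at the cost of more bookkeeping than single-point crossover.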

