
## Diffusion Maps

The basic idea of diffusion maps (Nadler, Lafon, Coifman, and Kevrekidis, 2005; Coifman and Lafon, 2006) is to construct a Markov chain over a graph of the data points and then carry out an eigenanalysis of the probability transition matrix of that Markov chain. As with the other algorithms in this section, there are three steps in this algorithm, with the first and second steps the same as for Laplacian eigenmaps. Although a nearest-neighbor search (Step 1) was not explicitly considered in the above papers on diffusion maps as a means of constructing the graph (Step 2), a nearest-neighbor search is included in software packages for computing diffusion maps. For an example in astronomy of a diffusion map incorporating a nearest-neighbor search, see Freeman, Newman, Lee, Richards, and Schafer (2009).

1. Nearest-Neighbor Search. Fix an integer $K$ or an $\epsilon>0$. Define a $K$-neighborhood $N_i^K$ or an $\epsilon$-neighborhood $N_i^\epsilon$ of the point $\mathbf{x}_i$ as in Step 1 of Laplacian eigenmaps. In general, let $N_i$ denote the neighborhood of $\mathbf{x}_i$.
2. Pairwise Adjacency Matrix. The $n$ data points $\left\{\mathbf{x}_i\right\}$ in $\Re^r$ can be regarded as a graph $\mathcal{G}=\mathcal{G}(\mathcal{V}, \mathcal{E})$ with the data points playing the role of vertices $\mathcal{V}=\left\{\mathbf{x}_1, \ldots, \mathbf{x}_n\right\}$. The edges $\mathcal{E}$ carry the connection strengths (or weights), $w\left(\mathbf{x}_i, \mathbf{x}_j\right)$, between pairs of adjacent vertices, $$w_{i j}=w\left(\mathbf{x}_i, \mathbf{x}_j\right)= \begin{cases}\exp \left\{-\frac{\left\|\mathbf{x}_i-\mathbf{x}_j\right\|^2}{2 \sigma^2}\right\}, & \text { if } \mathbf{x}_j \in N_i ; \\ 0, & \text { otherwise. }\end{cases}$$ This is a Gaussian kernel with width $\sigma$; however, other kernels may be used. Kernels such as (1.52) ensure that the closer two points are to each other, the larger the value of $w$. For convenience in exposition, we will suppress the fact that the elements of most of the matrices depend upon the value of $\sigma$. Then, $\mathbf{W}=\left(w_{i j}\right)$ is a pairwise adjacency matrix between the $n$ points. To make the matrix $\mathbf{W}$ even more sparse, entries that are smaller than some given threshold (i.e., the points in question are far apart from each other) can be set to zero. The graph $\mathcal{G}$ with weight matrix $\mathbf{W}$ gives information on the local geometry of the data.
3. Spectral embedding. Define $\mathbf{D}=\left(d_{i j}\right)$ to be a diagonal matrix formed from the matrix $\mathbf{W}$ by setting the diagonal elements, $d_{i i}=\sum_j w_{i j}$, to be the row sums of $\mathbf{W}$ (which equal the column sums when $\mathbf{W}$ is symmetric) and the off-diagonal elements to be zero. The $(n \times n)$ symmetric matrix $\mathbf{L}=\mathbf{D}-\mathbf{W}$ is the graph Laplacian for the graph $\mathcal{G}$. We are interested in the solutions of the generalized eigenequation, $\mathbf{L v}=\lambda \mathbf{D v}$, or, equivalently, in the eigenanalysis of the matrix
$$\mathbf{P}=\mathbf{D}^{-1 / 2} \mathbf{L} \mathbf{D}^{-1 / 2}=\mathbf{I}_n-\mathbf{D}^{-1 / 2} \mathbf{W} \mathbf{D}^{-1 / 2},$$
which is the normalized graph Laplacian. The matrix $\mathbf{H}=e^{-t \mathbf{P}}, t \geq 0$, is usually referred to as the heat kernel. The closely related matrix $\mathbf{D}^{-1} \mathbf{W}=\mathbf{D}^{-1 / 2}\left(\mathbf{I}_n-\mathbf{P}\right) \mathbf{D}^{1 / 2}$ is, by construction, a stochastic matrix with all row sums equal to one, and, thus, can be interpreted as defining a random walk on the graph $\mathcal{G}$.
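
The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the implementation used in any particular package: the function name `diffusion_map`, its parameters, the symmetrization of the $K$-nearest-neighbor graph, and the scaling of the coordinates by the eigenvalues raised to a power `t` are all choices made here for illustration. It also assumes that the data contain no duplicate points and that the graph is connected.

```python
import numpy as np


def diffusion_map(X, n_neighbors=5, sigma=1.0, n_components=2, t=1):
    """Toy diffusion map; function and parameter names are illustrative only."""
    n = X.shape[0]
    # Squared Euclidean distances between all pairs of points.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)

    # Step 1: indices of the K nearest neighbors of each point.
    # Column 0 of the sorted order is the point itself (assumes no duplicates).
    nbrs = np.argsort(sq, axis=1)[:, 1:n_neighbors + 1]

    # Step 2: Gaussian weights on neighbor pairs, zero elsewhere.
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nbrs.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-sq[rows, cols] / (2.0 * sigma ** 2))
    W = np.maximum(W, W.T)  # assumption: symmetrize the K-NN graph

    # Step 3: eigenanalysis of the random-walk matrix D^{-1} W, carried out on
    # the symmetric matrix S = D^{-1/2} W D^{-1/2}, which has the same eigenvalues.
    d = W.sum(axis=1)
    S = W / np.sqrt(np.outer(d, d))
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    P_rw = W / d[:, None]              # row-stochastic transition matrix
    psi = evecs / np.sqrt(d)[:, None]  # right eigenvectors of P_rw
    # Skip the first eigenvector (eigenvalue 1), which is constant.
    coords = psi[:, 1:n_components + 1] * evals[1:n_components + 1] ** t
    return P_rw, evals, coords


rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
P_rw, evals, coords = diffusion_map(X)
print(P_rw.sum(axis=1)[:3], evals[0], coords.shape)
```

Working with the symmetric matrix $\mathbf{D}^{-1/2}\mathbf{W}\mathbf{D}^{-1/2}$ keeps the eigenvalues real and lets `numpy.linalg.eigh` be used; its eigenvectors are rescaled by $\mathbf{D}^{-1/2}$ to obtain those of $\mathbf{D}^{-1}\mathbf{W}$.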

## Hessian Eigenmaps

Recall that, in certain situations, the convexity assumption for Isomap may be too restrictive. Instead, we may require that the manifold $\mathcal{M}$ be locally isometric to an open, connected subset of $\Re^t$. Popular examples include families of “articulated” images (i.e., translated or rotated images of the same object, possibly through time) that are found in a high-dimensional, digitized-image library (e.g., faces, pictures, handwritten numbers or letters). However, if the pixel elements of each 64-pixel-by-64-pixel digitized image are represented as a 4,096-dimensional vector in “pixel space,” it would be very difficult to show that the images really live on a low-dimensional manifold, especially if that image manifold is unknown.

We can model such images using a vector of smoothly varying articulation parameters $\boldsymbol{\theta} \in \Theta$. For example, digitized images of a person’s face that are varied by pose and illumination can be parameterized by two pose parameters (expression [happy, sad, sleepy, surprised, wink] and glasses-no glasses) and a lighting direction (centerlight, leftlight, rightlight, normal); similarly, handwritten “2”s appear to be parameterized essentially by two features, bottom loop and top arch (Tenenbaum, de Silva, and Langford, 2000; Roweis and Saul, 2000). To some extent, learning about an underlying image manifold depends upon whether the images are sufficiently scattered around the manifold and upon the quality of digitization of each image.

Hessian Eigenmaps (Donoho and Grimes, 2003b) were proposed for recovering manifolds of high-dimensional libraries of articulated images where the convexity assumption is often violated. Let $\Theta \subset \Re^t$ be the parameter space and suppose that $\phi: \Theta \rightarrow \Re^r$, where $t<r$. Assume $\mathcal{M}=\phi(\Theta)$ is a smooth manifold of articulated images. The isometry and convexity requirements of Isomap are replaced by the following weaker requirements:

• Local Isometry: $\phi$ is a locally isometric embedding of $\Theta$ into $\Re^r$. For any point $\mathbf{x}^{\prime}$ in a sufficiently small neighborhood around each point $\mathbf{x}$ on the manifold $\mathcal{M}$, the geodesic distance equals the Euclidean distance between their corresponding parameter points $\boldsymbol{\theta}, \boldsymbol{\theta}^{\prime} \in \Theta$; that is,
$$d^{\mathcal{M}}\left(\mathbf{x}, \mathbf{x}^{\prime}\right)=\left\|\boldsymbol{\theta}-\boldsymbol{\theta}^{\prime}\right\|_{\Theta},$$
where $\mathbf{x}=\phi(\boldsymbol{\theta})$ and $\mathbf{x}^{\prime}=\phi\left(\boldsymbol{\theta}^{\prime}\right)$.
• Connectedness: The parameter space $\Theta$ is an open, connected subset of $\Re^t$.
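
The local isometry requirement can be checked numerically for a concrete map. The sketch below is an illustration only: the map `phi` (a strip rolled onto a cylinder) and the helper `jacobian` are hypothetical choices, not taken from Donoho and Grimes. A smooth map is a local isometry when its Jacobian $\mathbf{J}$ satisfies $\mathbf{J}^{\top}\mathbf{J}=\mathbf{I}_t$, so that small displacements in $\Theta$ keep their length on $\mathcal{M}$.

```python
import numpy as np


def phi(theta):
    """Hypothetical embedding of a 2-D parameter into R^3 (a rolled-up strip)."""
    t1, t2 = theta
    return np.array([np.cos(t1), np.sin(t1), t2])


def jacobian(f, theta, h=1e-6):
    """Central-difference Jacobian of f at theta (one column per parameter)."""
    theta = np.asarray(theta, dtype=float)
    cols = []
    for k in range(theta.size):
        e = np.zeros_like(theta)
        e[k] = h
        cols.append((f(theta + e) - f(theta - e)) / (2.0 * h))
    return np.stack(cols, axis=1)


theta_a = np.array([0.7, 0.2])
J = jacobian(phi, theta_a)
gram = J.T @ J  # close to the identity for a local isometry

# For a nearby parameter point, distances in R^3 and in Theta nearly agree.
theta_b = theta_a + np.array([1e-3, -2e-3])
ratio = np.linalg.norm(phi(theta_b) - phi(theta_a)) / np.linalg.norm(theta_b - theta_a)
print(np.round(gram, 6), ratio)
```

Globally the cylinder is not flat in $\Re^3$, yet every small patch preserves parameter-space distances, which is exactly the weaker condition stated above.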

## History of Probabilistic Dimensionality Reduction

Probabilistic variants of many of the spectral dimensionality reduction methods were gradually developed and proposed. For example, probabilistic PCA [50, 62] is the stochastic version of PCA; it assumes that the data are obtained by adding noise to a linear projection of a latent random variable. It can be demonstrated that PCA is a special case of probabilistic PCA in which the variance of the noise tends to zero. Probabilistic PCA is, in turn, a special case of factor analysis [17]. In factor analysis, the different dimensions of noise can be correlated, while they are uncorrelated and isotropic in probabilistic PCA. Factor analysis and probabilistic PCA use expectation maximization and variational inference on the data.
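
For reference, the generative model usually written for probabilistic PCA is given below. The symbols are assumptions introduced here for illustration and are not the book's notation: $\mathbf{x} \in \mathbb{R}^d$ is an observation, $\mathbf{z} \in \mathbb{R}^p$ the latent variable, $\mathbf{A}$ a $(d \times p)$ loading matrix, $\boldsymbol{\mu}$ the mean, and $\sigma^2$ the noise variance.

$$\mathbf{x}=\mathbf{A} \mathbf{z}+\boldsymbol{\mu}+\boldsymbol{\varepsilon}, \quad \mathbf{z} \sim \mathcal{N}\left(\mathbf{0}, \mathbf{I}_p\right), \quad \boldsymbol{\varepsilon} \sim \mathcal{N}\left(\mathbf{0}, \sigma^2 \mathbf{I}_d\right), \quad \text { so that } \quad \mathbf{x} \sim \mathcal{N}\left(\boldsymbol{\mu}, \mathbf{A} \mathbf{A}^{\top}+\sigma^2 \mathbf{I}_d\right) .$$

Letting $\sigma^2 \rightarrow 0$ gives the PCA limit mentioned above, while replacing $\sigma^2 \mathbf{I}_d$ by a less restricted noise covariance leads to factor analysis.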

The linear spectral dimensionality reduction methods, such as PCA and FDA, learn a projection matrix from the data for better data representation or more separation between classes. In 1984, it was shown, somewhat surprisingly, that a random matrix that is not learned from the data can also be used for the linear projection and still represents the data well. This behavior of random projection is justified by the Johnson-Lindenstrauss lemma [33], which bounds the error in preserving distances in the subspace. Later, nonlinear variants of random projection were developed, including random Fourier features [48] and random kitchen sinks [49].
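
The distance-preserving behavior of random projection is easy to observe numerically. The following sketch is an illustration under assumptions made here: the Gaussian projection matrix scaled by $1/\sqrt{k}$ is one common construction, and the function names and the sizes (20 points, 1000 input dimensions, 300 projected dimensions) are arbitrary.

```python
import numpy as np


def random_projection(X, k, seed=0):
    """Project the rows of X to k dimensions with a random Gaussian matrix."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.normal(size=(d, k)) / np.sqrt(k)  # not learned from the data
    return X @ R


def pairwise_distances(X):
    """Matrix of Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))


rng = np.random.default_rng(1)
X = rng.normal(size=(20, 1000))
Y = random_projection(X, k=300)

# Ratio of projected to original distance for every pair of distinct points.
iu = np.triu_indices(X.shape[0], k=1)
ratios = pairwise_distances(Y)[iu] / pairwise_distances(X)[iu]
print(ratios.min(), ratios.max())
```

The ratios cluster around one, and they concentrate further as the projected dimension `k` grows.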

Sufficient Dimension Reduction (SDR) is another family of probabilistic methods, whose first method was proposed in 1991 [39]. It is used for finding a transformation of data to a lower-dimensional space that does not change the conditional distribution of labels given data. Therefore, the subspace is sufficient for predicting labels from the data projected onto the subspace. SDR was mainly proposed for high-dimensional regression, where the regression labels are used. Later, Kernel Dimensionality Reduction (KDR) was proposed [18] as a method in the family of SDR for dimensionality reduction in machine learning.

Stochastic Neighbour Embedding (SNE) was proposed in 2003 [30] and took a probabilistic approach to dimensionality reduction. It attempted to preserve the probability of a point being a neighbour of others in the subspace. A problem with SNE was that it could not find an optimal subspace because it was not possible for it to preserve all the important information of high-dimensional data in the low-dimensional subspace. Therefore, t-SNE was proposed [41], which used another distribution with more capacity in the subspace, the heavier-tailed Student's t-distribution. This allowed t-SNE to preserve a larger amount of information in the low-dimensional subspace. A recent successful probabilistic dimensionality reduction method is the Uniform Manifold Approximation and Projection (UMAP) [43], which is widely used for data visualization. Today, both t-SNE and UMAP are used for high-dimensional data visualization, especially in the visualization of extracted features in deep learning. They have also been widely used for visualizing high-dimensional genome data.

## History of Neural Network-Based Dimensionality

Neural networks are machine learning models modeled after the neural structure of the human brain. They are currently powerful tools for representation learning and dimensionality reduction. In the 1990s, researchers’ interest in neural networks decreased; this period was called the winter of neural networks. It occurred mainly because networks could not be made deep, as gradients vanished after many layers of the network during optimization. The success of kernel support vector machines [10] also exacerbated this winter. In 2006, Hinton and Salakhutdinov demonstrated that a network’s weights can be initialized using energy-based training, where the layers of the network are considered stacks of Restricted Boltzmann Machines (RBM) [1, 31]. An RBM is a two-layer structure of neurons in which the weights between the two layers are trained using maximum likelihood estimation [68]. This initialization saved the neural network from the vanishing gradient problem and ended the neural networks’ winter. A deep network using RBM training was named the deep belief network [29]. Later, however, the proposal of the ReLU activation function [23] and the dropout technique [59] made it possible to train deep neural networks with random initial weights [24].

In fundamental machine learning, people often extract features using traditional dimensionality reduction and then apply the classification, regression, or clustering task afterwards. However, modern deep learning extracts features and learns embedding spaces in the layers of the network; this process is called end-to-end. Therefore, deep learning can be seen as performing a form of dimensionality reduction as part of its model. One problem with end-to-end models is that they are harder to troubleshoot if the performance is not satisfactory on a part of the data. The insights and meaning of the data coming from representation learning are critical to fully understand a model’s performance. Some of these insights can be useful for improving or understanding how deep neural networks operate. Researchers often visualize the extracted features of a neural network to interpret and analyze why deep learning is working properly on their data.

Deep metric learning [35] utilizes deep neural networks for extracting low-dimensional descriptive features from data at the last or second-to-last layer of the network. Siamese networks [11] are important network structures for deep metric learning. They contain several identical networks that share their weights but have different inputs. Contrastive loss [27] and triplet loss [56] are two well-known loss functions that were proposed for training Siamese networks. Deep reconstruction autoencoders also make it possible to capture informative features at the bottleneck between the encoder and decoder.
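
The two loss functions named above can be written down for a single pair or triplet of embedding vectors. The sketch below follows one common convention (squared Euclidean distances and a hinge at a margin); the exact forms in [27] and [56] may differ by constant factors, and the function names and default margins are assumptions made for illustration. The inputs stand for the outputs of the weight-sharing networks.

```python
import numpy as np


def contrastive_loss(x1, x2, same, margin=1.0):
    """Pull a similar pair together; push a dissimilar pair beyond the margin."""
    d = np.linalg.norm(x1 - x2)
    if same:
        return d ** 2
    return max(0.0, margin - d) ** 2


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Ask the anchor to be closer to the positive than to the negative by a margin."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)


a = np.array([0.0, 0.0])
near = np.array([1.0, 0.0])
far = np.array([3.0, 4.0])

print(contrastive_loss(a, far, same=True))               # large: similar pair is far apart
print(contrastive_loss(a, far, same=False, margin=4.0))  # zero: already beyond the margin
print(triplet_loss(a, near, far))                        # zero: ordering already satisfied
print(triplet_loss(a, far, near))                        # large: ordering violated
```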

## Dimensionality Reduction and Manifold Learning

Feature extraction is also referred to as dimensionality reduction, manifold learning [12], subspace learning, submanifold learning, manifold unfolding, embedding, encoding, and representation learning [7, 70]. This book uses manifold learning and dimensionality reduction interchangeably for feature extraction. Manifold learning techniques can be used in a variety of ways, including:

• Data dimensionality reduction: Produce a compact (compressed) low-dimensional encoding of a given high-dimensional dataset.
• Data visualization: Provide an interpretation of a given dataset in terms of intrinsic degrees of freedom, usually as a byproduct of data dimensionality reduction.
• Preprocessing for supervised learning: Simplify, reduce, and clean the data for subsequent supervised training.

In dimensionality reduction, the data points are mapped to a lower-dimensional subspace either linearly or nonlinearly. Dimensionality reduction methods can be grouped into three categories: spectral dimensionality reduction, probabilistic dimensionality reduction, and (artificial) neural network-based dimensionality reduction [19] (see Fig. 1.3). These categories are introduced in the following subsections.

Consider several points in a three-dimensional Euclidean space. Assume these points lie on a nonlinear submanifold with a local dimensionality of two, as illustrated in Fig. 1.4a. This means that there is no need to have three features to represent each of these points. Rather, two features can represent most of the data points’ information if the two features give the 2D coordinates on the submanifold. A nonlinear dimensionality reduction method can unfold this manifold correctly, as depicted in Fig. 1.4b. However, a linear dimensionality reduction method cannot properly find a correct underlying 2D representation of the data points. Figure 1.4c demonstrates that a linear method ruins the relative structure of the data points. This is because a linear method uses the Euclidean distances between the points, while a nonlinear method considers the geodesic distances along the nonlinear submanifold. If the submanifold is linear, the linear method is able to obtain the lower-dimensional structure of the data. Spectral dimensionality reduction methods typically have a geometric perspective and attempt to find the linear or nonlinear submanifold of the data. These methods are often reduced to a generalized eigenvalue problem [20].
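
The gap between Euclidean and geodesic distance can be seen on a toy curve. The sketch below simplifies the setting to a one-dimensional submanifold (a unit semicircle) in the plane, which is an assumption made here for brevity; the geodesic distance is approximated by summing the distances between consecutive sample points, and the function names are illustrative.

```python
import math


def semicircle(n_points, radius=1.0):
    """Sample points along a semicircle, ordered from one end to the other."""
    return [(radius * math.cos(math.pi * i / (n_points - 1)),
             radius * math.sin(math.pi * i / (n_points - 1)))
            for i in range(n_points)]


def euclidean(p, q):
    """Straight-line distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


points = semicircle(1001)

# Distance between the two ends measured straight through the ambient space.
straight = euclidean(points[0], points[-1])

# Distance between the same two ends measured along the curve.
along = sum(euclidean(points[i], points[i + 1]) for i in range(len(points) - 1))

print(straight, along)
```

The straight-line distance is the diameter, whereas the distance along the curve is close to half the circumference; a method that relies only on the former treats the two ends as closer than they are on the submanifold.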

## History of Spectral Dimensionality Reduction

Principal Component Analysis (PCA) [34] was first proposed by Pearson in 1901 [47]. It was the first spectral dimensionality reduction method and one of the first methods in linear subspace learning. It is unsupervised, meaning that it does not use any class labels. Fisher Discriminant Analysis (FDA) [16], proposed by Fisher in 1936 [15], was the first supervised spectral dimensionality reduction method. PCA and FDA are based on the scatter, i.e., variance of the data. A proper subspace preserves either the relative similarity or relative dissimilarity of the data points after transformation of data from the input space to the subspace. This was the goal of Multidimensional Scaling (MDS) [13], which preserves the relative similarities of data points in its subspace. In later MDS approaches, the cost function was changed to preserve the distances between points [37], which developed into Sammon mapping [52]. Sammon mapping is considered to be the first nonlinear dimensionality reduction method.

Figure 1.4 demonstrates that a linear algorithm cannot perform well on nonlinear data. For nonlinear data, two approaches can be used:

• a nonlinear algorithm should be designed to handle nonlinear data, or
• the nonlinear data should be modified to become linear. In this case, the data should be transformed to another space to become linearly separable in that space. A linear approach can then be applied to the transformed data, which now have a linear pattern. This approach is called kernelization in machine learning.

Kernel PCA [54, 55] uses PCA and the kernel trick [32] to transform data to a high-dimensional space so that it becomes roughly linear within that space. Kernel FDA [44, 45] was also proposed to manipulate nonlinear data in a supervised manner using representation theory [3]. Representation theory can be used for kernelization; it will be introduced in Chap. 3.
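
A compact sketch of kernel PCA with a radial basis function kernel is given below. It is an illustration under assumptions made here, not the formulation of [54, 55]: the kernel choice, the parameter `gamma`, and the function name are hypothetical. The kernel matrix is double-centered, which corresponds to centering the data in the implicit high-dimensional space, and then eigendecomposed.

```python
import numpy as np


def rbf_kernel_pca(X, gamma=1.0, n_components=2):
    """Embed the rows of X with kernel PCA using an RBF kernel."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq)

    # Double-center the kernel matrix (centering in feature space).
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    Kc = H @ K @ H

    # Leading eigenpairs of the centered kernel matrix.
    evals, evecs = np.linalg.eigh(Kc)
    order = np.argsort(evals)[::-1][:n_components]
    evals, evecs = evals[order], evecs[:, order]

    # Embedding of the training points: eigenvectors scaled by sqrt(eigenvalue).
    Z = evecs * np.sqrt(np.maximum(evals, 0.0))
    return Z, Kc


# Two concentric circles: a pattern that is not linear in the input space.
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
inner = np.c_[np.cos(angles), np.sin(angles)]
outer = 3.0 * inner
X = np.vstack([inner, outer])

Z, Kc = rbf_kernel_pca(X, gamma=0.5)
print(Z.shape)
```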

## Manifold Hypothesis

Each feature of a data point does not carry an equal amount of information. For example, some pixels of an image are background regions with limited information, while other pixels contain important objects that describe the scene in the image. This means that data points can be significantly compressed to preserve the most informative features while eliminating those with limited information. In other words, the $d$-dimensional data points of a dataset usually do not cover the entire $d$-dimensional Euclidean space, but they lie on a specific lower-dimensional structure in the space.

Consider the illustration in Fig. 1.1, where several three-dimensional points exist in $\mathbb{R}^3$. These points can represent any measurement, such as personal health measurements, including blood pressure, blood sugar, and blood fat. As demonstrated in Fig. 1.1, the points of the dataset have a structure in a two-dimensional space. The three-dimensional Euclidean space is called the input space, and the two-dimensional space, which has a lower dimensionality than the input space, is called the subspace, the submanifold, or the embedding space. The subspace can be either linear or nonlinear, depending on whether a linear (hyper)plane passes through the points. Usually, subspace and submanifold are used for linear and nonlinear lower-dimensional spaces, respectively. Linear and nonlinear subspaces are depicted in Fig. 1.1a and b, respectively.

Whether the points of a dataset lie on such a lower-dimensional structure is a hypothesis, but this hypothesis usually holds because the data points typically represent a natural signal, such as an image. When the data acquisition process is natural, the data will have a definite structure. For example, in a dataset that contains multiple images depicting the same scene from different angles, the objects of the scene remain the same, but the point of view changes (see Fig. 1.2). This hypothesis is called the manifold hypothesis [14]. It states that the data points of a dataset lie on a submanifold or subspace with lower dimensionality. In other words, the dataset in $\mathbb{R}^d$ lies on an embedded submanifold [38] with local dimensionality less than $d$ [14]. According to this hypothesis, the data points lie on a submanifold with high probability [64].

## Feature Engineering

Due to the manifold hypothesis, a dataset can be compressed while preserving most of the important information. Therefore, engineering and processing can be applied to the features for the sake of compression [4]. Feature engineering can be seen as a preprocessing stage, where the dimensionality of the data is reduced. Assume $d$ and $p$ denote the dimensionality of the input space and the subspace, respectively, where $p \in(0, d]$. Feature engineering is a map from a $d$-dimensional Euclidean space to a $p$-dimensional Euclidean space, i.e., $\mathbb{R}^d \rightarrow \mathbb{R}^p$. The dimensionality of the subspace is usually much smaller than the dimensionality of the input space, i.e., $p \ll d$, because most of the information usually exists in only a few features.

Feature engineering is divided into two broad approaches: feature selection and feature extraction [22]. In feature selection, the $p$ most informative features of the $d$-dimensional data vector are selected, so the features of the transformed data points are a subset of the original features. In feature extraction, however, the $d$-dimensional data vector is transformed to a $p$-dimensional data vector, where the $p$ new features are completely different from the original features. In other words, data points are represented in another lower-dimensional space. Both feature selection and feature extraction are used for compression, which results in either better discrimination of classes or better representation of data. This book concentrates on feature extraction.
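
The difference between the two approaches can be made concrete with a small NumPy sketch. It rests on assumptions introduced here for illustration: variance is used as a stand-in for how informative a feature is, PCA (computed by a singular value decomposition) plays the role of the feature extraction method, and the function names are hypothetical.

```python
import numpy as np


def select_features(X, p):
    """Feature selection: keep the p original columns with the largest variance."""
    variances = X.var(axis=0)
    keep = np.sort(np.argsort(variances)[::-1][:p])
    return X[:, keep], keep


def extract_features(X, p):
    """Feature extraction: project centered data onto the top p principal directions."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:p].T


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6)) @ rng.normal(size=(6, 6))  # correlated features

X_sel, keep = select_features(X, p=2)
X_ext = extract_features(X, p=2)

print(keep)                        # indices of the retained original features
print(X_sel.var(axis=0).sum())     # variance kept by selection
print(X_ext.var(axis=0).sum())     # variance kept by extraction
```

The selected features are literally columns of the original data, whereas each extracted feature is a linear combination of all six columns; in this variance-based setting the extracted pair retains at least as much variance as any selected pair.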

## Curves and Geodesics

If the Riemannian manifold $(\mathcal{M}, g)$ is connected, it is a metric space with an induced topology that coincides with the underlying manifold topology. We can, therefore, define a function $d^{\mathcal{M}}$ on $\mathcal{M}$ that calculates distances between points on $\mathcal{M}$ and determines its structure.

Let $\mathbf{p}, \mathbf{q} \in \mathcal{M}$ be any two points on the Riemannian manifold $\mathcal{M}$. We first define the length of a (one-dimensional) curve in $\mathcal{M}$ that joins $\mathbf{p}$ to $\mathbf{q}$, and then the length of the shortest such curve.

A curve in $\mathcal{M}$ is defined as a smooth mapping from an open interval $\Lambda$ (which may have infinite length) in $\Re$ into $\mathcal{M}$. The point $\lambda \in \Lambda$ forms a parametrization of the curve. Let $c(\lambda)=\left(c_{1}(\lambda), \cdots, c_{d}(\lambda)\right)^{\top}$ be a curve in $\Re^{d}$ parametrized by $\lambda \in \Lambda \subseteq \Re$. If we take the coordinate functions, $\left\{c_{h}(\lambda)\right\}$, of $c(\lambda)$ to be as smooth as needed (usually, $\mathcal{C}^{\infty}$, functions that have any number of continuous derivatives), then we say that $c$ is a smooth curve. If $c(\lambda+\alpha)=c(\lambda)$ for all $\lambda, \lambda+\alpha \in \Lambda$, the curve $c$ is said to be closed. The velocity (or tangent) vector at the point $\lambda$ is given by
$$c^{\prime}(\lambda)=\left(c_{1}^{\prime}(\lambda), \cdots, c_{d}^{\prime}(\lambda)\right)^{\top},$$
where $c_{j}^{\prime}(\lambda)=d c_{j}(\lambda) / d \lambda$, and the “speed” of the curve is
$$\left\|c^{\prime}(\lambda)\right\|=\left\{\sum_{j=1}^{d}\left[c_{j}^{\prime}(\lambda)\right]^{2}\right\}^{1 / 2} .$$
Distance on a smooth curve $c$ is given by arc-length, which is measured from a fixed point $\lambda_{0}$ on that curve. Usually, the fixed point is taken to be the origin, $\lambda_{0}=0$, defined to be one of the two endpoints of the data. More generally, the arc-length $L(c)$ along the curve $c(\lambda)$ from point $\lambda_{0}$ to point $\lambda_{1}$ is defined as
$$L(c)=\int_{\lambda_{0}}^{\lambda_{1}}\left\|c^{\prime}(\lambda)\right\| d \lambda .$$
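
The arc-length integral can be approximated numerically for any smooth curve. The sketch below is illustrative only: the helix is a hypothetical example curve, the speed is estimated by central differences, and the integral is evaluated with the midpoint rule. For the helix $c(\lambda)=(\cos \lambda, \sin \lambda, \lambda)^{\top}$ the speed is constant at $\sqrt{2}$, so the length over $[0,2 \pi]$ should be close to $2 \pi \sqrt{2}$.

```python
import math


def helix(lam):
    """Example curve in R^3 parametrized by lam."""
    return (math.cos(lam), math.sin(lam), lam)


def speed(curve, lam, h=1e-6):
    """Norm of the velocity vector, with derivatives taken by central differences."""
    ahead, behind = curve(lam + h), curve(lam - h)
    return math.sqrt(sum(((a - b) / (2.0 * h)) ** 2 for a, b in zip(ahead, behind)))


def arc_length(curve, lam0, lam1, n_steps=2000):
    """Midpoint-rule approximation of the integral of the speed over [lam0, lam1]."""
    step = (lam1 - lam0) / n_steps
    return step * sum(speed(curve, lam0 + (i + 0.5) * step) for i in range(n_steps))


length = arc_length(helix, 0.0, 2.0 * math.pi)
print(length, 2.0 * math.pi * math.sqrt(2.0))
```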

## Linear Manifold Learning

Most statistical theory and applications that deal with the problem of dimensionality reduction are focused on linear dimensionality reduction and, by extension, linear manifold learning. A linear manifold can be visualized as a line, a plane, or a hyperplane, depending upon the number of dimensions involved. Data are observed in some high-dimensional space and it is usually assumed that a lower-dimensional linear manifold would be the most appropriate summary of the relationship between the variables. Although data tend not to live on a linear manifold, we view the problem as having two kinds of motivations. The first such motivation is to assume that the data live close to a linear manifold, the distance off the manifold determined by a random error (or noise) component. A second way of thinking about linear manifold learning is that a linear manifold is really a simple linear approximation to a more complicated type of nonlinear manifold that would probably be a better fit to the data. In both scenarios, the intrinsic dimensionality of the linear manifold is taken to be much smaller than the dimensionality of the data.

Identifying a linear manifold embedded in a higher-dimensional space is closely related to the classical statistics problem of linear dimensionality reduction. The recommended way of accomplishing linear dimensionality reduction is to create a reduced set of linear transformations of the input variables. Linear transformations are projection methods, and so the problem is to derive a sequence of low-dimensional projections of the input data that possess some type of optimal properties.

There are many techniques that can be used for either linear dimensionality reduction or linear manifold learning. In this chapter, we describe only two linear methods, namely, principal component analysis and multidimensional scaling. The earliest projection method was principal component analysis (dating back to 1933), and this technique has become the most popular dimensionality-reducing technique in use today. A related method is that of multidimensional scaling (dating back to 1952), which has a very different motivation. An adaptation of multidimensional scaling provided the core element of the Isomap algorithm for nonlinear manifold learning.



## 机器学习代写|流形学习代写manifold data learning代考|Topological Spaces

Topological spaces were introduced by Maurice Fréchet (1906) (in the form of metric spaces), and the idea was developed and extended over the next few decades. Amongst those who contributed significantly to the subject was Felix Hausdorff, who in 1914 coined the phrase “topological space” using Johann Benedict Listing’s German word Topologie, introduced in 1847.

A topological space is a nonempty set $\mathcal{X}$ together with a collection $\mathcal{T}$ of subsets of $\mathcal{X}$ that contains the empty set and the space itself, and that is closed under arbitrary unions and finite intersections of its members. A topological space is often denoted by $(\mathcal{X}, \mathcal{T})$, where $\mathcal{T}$ represents the topology associated with $\mathcal{X}$. The elements of $\mathcal{T}$ are called the open sets of $\mathcal{X}$, and a set is closed if its complement is open. Topological spaces can also be characterized through the concept of neighborhood. If $\mathbf{x}$ is a point in a topological space $\mathcal{X}$, a neighborhood of $\mathbf{x}$ is any set that contains an open set containing $\mathbf{x}$.

Let $\mathcal{X}$ and $\mathcal{Y}$ be two topological spaces, and let $U \subset \mathcal{X}$ and $V \subset \mathcal{Y}$ be open subsets. Consider the family of all cartesian products of the form $U \times V$. The topology generated by these products of open subsets (that is, the topology having them as a basis) is called the product topology for $\mathcal{X} \times \mathcal{Y}$. If $W \subset \mathcal{X} \times \mathcal{Y}$, then $W$ is open relative to the product topology iff for each point $(x, y) \in W$ there are open neighborhoods, $U$ of $x$ and $V$ of $y$, such that $U \times V \subset W$. For example, the usual topology for $d$-dimensional Euclidean space $\Re^{d}$ consists of all open sets of points in $\Re^{d}$, and this topology is equivalent to the product topology for the product of $d$ copies of $\Re$.
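On a finite set, both definitions can be checked by brute force. The sketch below is only an illustration; the function names and the two-point example spaces are assumptions introduced here. It tests whether a collection of subsets is a topology, and whether a subset of a product is open in the product topology by using the criterion that every point of the subset must lie in some product $U \times V$ of open sets contained in it.

```python
from itertools import combinations

def is_topology(X, T):
    """Check the topology axioms for a finite collection T of subsets of X."""
    X = frozenset(X)
    T = {frozenset(s) for s in T}
    if frozenset() not in T or X not in T:
        return False
    # For a finite collection, closure under pairwise unions and intersections
    # implies closure under all unions and all finite intersections.
    return all(A | B in T and A & B in T for A, B in combinations(T, 2))

def product_basis(TX, TY):
    """All products U x V of an open set U of X with an open set V of Y."""
    return {frozenset((x, y) for x in U for y in V) for U in TX for V in TY}

def is_open_in_product(W, TX, TY):
    """W is open iff each point of W lies in some basis set contained in W."""
    W = frozenset(W)
    basis = product_basis(TX, TY)
    return all(any(p in B and B <= W for B in basis) for p in W)

# Hypothetical two-point spaces, each with the open sets {}, {first point}, whole space.
X, TX = {1, 2}, [set(), {1}, {1, 2}]
Y, TY = {"a", "b"}, [set(), {"a"}, {"a", "b"}]

tx_ok = is_topology(X, TX)
not_ok = is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}])   # {1} | {2} is missing
open_w = is_open_in_product({(1, "a")}, TX, TY)
not_open_w = is_open_in_product({(2, "b")}, TX, TY)
```

In this example the only open sets containing the points `2` and `"b"` are the whole spaces, so the singleton `{(2, "b")}` cannot contain a basis set around its point.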

One of the core elements of manifold learning involves the idea of “embedding” one topological space inside another. Loosely speaking, the space $\mathcal{X}$ is said to be embedded in the space $\mathcal{Y}$ if the topological properties of $\mathcal{Y}$ when restricted to $\mathcal{X}$ are identical to the topological properties of $\mathcal{X}$. To be more specific, we state the following definitions. A function $g: \mathcal{X} \rightarrow \mathcal{Y}$ is said to be continuous if the inverse image of an open set in $\mathcal{Y}$ is an open set in $\mathcal{X}$. If $g$ is a bijective (i.e., one-to-one and onto) function such that $g$ and its inverse $g^{-1}$ are continuous, then $g$ is said to be a homeomorphism. Two topological spaces $\mathcal{X}$ and $\mathcal{Y}$ are said to be homeomorphic (or topologically equivalent) if there exists a homeomorphism from one space onto the other. A topological space $\mathcal{X}$ is said to be embedded in a topological space $\mathcal{Y}$ if $\mathcal{X}$ is homeomorphic to a subspace of $\mathcal{Y}$.

## 机器学习代写|流形学习代写manifold data learning代考|Riemannian Manifolds

In the entire theory of topological manifolds, there is no mention of the use of calculus. However, in a prototypical application of a “manifold,” calculus enters in the form of a “smooth” (or differentiable) manifold $\mathcal{M}$, also known as a Riemannian manifold; it is usually defined in differential geometry as a submanifold of some ambient (or surrounding) Euclidean space, where the concepts of length, curvature, and angle are preserved, and where smoothness relates to differentiability. The word manifold (in German, Mannigfaltigkeit) was coined in an “intuitive” way and without any precise definition by Georg Friedrich Bernhard Riemann (1826-1866) in his 1851 doctoral dissertation (Riemann, 1851; Dieudonné, 2009); in 1854, Riemann introduced in his famous Habilitations lecture the idea of a topological manifold on which one could carry out differential and integral calculus.

A topological manifold $\mathcal{M}$ is called a smooth (or differentiable) manifold if $\mathcal{M}$ is continuously differentiable to any order. All smooth manifolds are topological manifolds, but the reverse is not necessarily true. (Note: Authors often differ on the precise definition of a “smooth” manifold.)

We now define the analogue of a homeomorphism for a differentiable manifold. Consider two open sets, $U \subset \Re^{r}$ and $V \subset \Re^{s}$, and let $g: U \rightarrow V$ so that for $\mathbf{x} \in U$ and $\mathbf{y} \in V$, $g(\mathbf{x})=\mathbf{y}$. If the function $g$ has finite first-order partial derivatives, $\partial y_{j} / \partial x_{i}$, for all $i=1,2, \ldots, r$, and all $j=1,2, \ldots, s$, then $g$ is said to be a smooth (or differentiable) mapping on $U$. We also say that $g$ is a $\mathcal{C}^{1}$-function on $U$ if all the first-order partial derivatives are continuous. More generally, if $g$ has continuous higher-order partial derivatives, $\partial^{k_{1}+\cdots+k_{r}} y_{j} / \partial x_{1}^{k_{1}} \cdots \partial x_{r}^{k_{r}}$, for all $j=1,2, \ldots, s$ and all nonnegative integers $k_{1}, k_{2}, \ldots, k_{r}$ such that $k_{1}+k_{2}+\cdots+k_{r} \leq r$, then we say that $g$ is a $\mathcal{C}^{r}$-function, $r=1,2, \ldots$. If $g$ is a $\mathcal{C}^{r}$-function for all $r \geq 1$, then we say that $g$ is a $\mathcal{C}^{\infty}$-function.

If $g$ is a homeomorphism from an open set $U$ to an open set $V$, then it is said to be a $\mathcal{C}^{r}$-diffeomorphism if $g$ and its inverse $g^{-1}$ are both $\mathcal{C}^{r}$-functions. A $\mathcal{C}^{\infty}$-diffeomorphism is simply referred to as a diffeomorphism. We say that $U$ and $V$ are diffeomorphic if there exists a diffeomorphism between them. These definitions extend in a straightforward way to manifolds. For example, if $\mathcal{X}$ and $\mathcal{Y}$ are both smooth manifolds, the function $g: \mathcal{X} \rightarrow \mathcal{Y}$ is a diffeomorphism if it is a homeomorphism from $\mathcal{X}$ to $\mathcal{Y}$ and both $g$ and $g^{-1}$ are smooth. Furthermore, $\mathcal{X}$ and $\mathcal{Y}$ are diffeomorphic if there exists a diffeomorphism between them, in which case, $\mathcal{X}$ and $\mathcal{Y}$ are essentially indistinguishable from each other.
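A standard example, added here for illustration and not taken from the original text, shows that a homeomorphism need not be a diffeomorphism even when it is itself smooth. Take $U=V=\Re$ and

$$g(x)=x^{3}, \qquad g^{-1}(y)=\operatorname{sign}(y)\,|y|^{1 / 3} .$$

Both $g$ and $g^{-1}$ are continuous bijections, so $g$ is a homeomorphism, and $g$ is a $\mathcal{C}^{\infty}$-function. However, for $y \neq 0$,

$$\frac{d}{d y} g^{-1}(y)=\frac{1}{3}|y|^{-2 / 3},$$

which is unbounded as $y \rightarrow 0$, so $g^{-1}$ is not differentiable at $y=0$ and $g$ is not even a $\mathcal{C}^{1}$-diffeomorphism. Restricted to $U=V=(0, \infty)$, the same map is a diffeomorphism, because the inverse is then smooth on all of $V$.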


## 机器学习代写|流形学习代写manifold data learning代考|Spectral Embedding Methods for Manifold Learning

Manifold learning encompasses much of the disciplines of geometry, computation, and statistics, and has become an important research topic in data mining and statistical learning. The simplest description of manifold learning is that it is a class of algorithms for recovering a low-dimensional manifold embedded in a high-dimensional ambient space. Major breakthroughs on methods for recovering low-dimensional nonlinear embeddings of high-dimensional data (Tenenbaum, de Silva, and Langford, 2000; Roweis and Saul, 2000) led to the construction of a number of other algorithms for carrying out nonlinear manifold learning and its close relative, nonlinear dimensionality reduction. The primary tool of all embedding algorithms is the set of eigenvectors associated with the top few or bottom few eigenvalues of an appropriate random matrix. We refer to these algorithms as spectral embedding methods. Spectral embedding methods are designed to recover linear or nonlinear manifolds, usually in high-dimensional spaces.

Linear methods, which have long been considered part-and-parcel of the statistician’s toolbox, include principal component analysis (PCA) and multidimensional scaling (MDS). PCA has been used successfully in many different disciplines and applications. In computer vision, for example, PCA is used to study abstract notions of shape, appearance, and motion to help solve problems in facial and object recognition, surveillance, person tracking, security, and image compression where data are of high dimensionality (Turk and Pentland, 1991; De la Torre and Black, 2001). In astronomy, where very large digital sky surveys have become the norm, PCA has been used to analyze and classify stellar spectra, carry out morphological and spectral classification of galaxies and quasars, and analyze images of supernova remnants (Steiner, Menezes, Ricci, and Oliveira, 2009). In bioinformatics, PCA has been used to study high-dimensional data generated by genome-wide, gene-expression experiments on a variety of tissue sources, where scatterplots of the top principal components in such studies often show specific classes of genes that are expressed by different clusters of distinctive biological characteristics (Yeung and Ruzzo, 2001; Zheng-Bradley, Rung, Parkinson, and Brazma, 2010). PCA has also been used to select an optimal subset of single nucleotide polymorphisms (SNPs) (Lin and Altman, 2004). PCA is also used to derive approximations to more complicated nonlinear subspaces, including problems involving data interpolation, compression, denoising, and visualization.

## 机器学习代写|流形学习代写manifold data learning代考|Spaces and Manifolds

Manifold learning involves concepts from general topology and differential geometry. Good introductions to topological spaces include Kelley (1955), Willard (1970), Bourbaki (1989), Mendelson (1990), Steen (1995), James (1999), and several of these have since been reprinted. Books on differential geometry include Spivak (1965), Kreyszig (1991), Kühnel (2000), Lee (2002), and Pressley (2010).

Manifolds generalize the notions of curves and surfaces in two and three dimensions to higher dimensions. Before we give a formal description of a manifold, it will be helpful to visualize the notion of a manifold. Imagine an ant at a picnic, where there are all sorts of items from cups to doughnuts. The ant crawls all over the picnic items, but because of its tiny size, the ant sees everything on a very small scale as flat and featureless. Similarly, a human, looking around at the immediate vicinity, would not see the curvature of the earth. A manifold (also referred to as a topological manifold) can be thought of in similar terms, as a topological space that locally looks flat and featureless and behaves like Euclidean space. Unlike a metric space, a topological space has no concept of distance. In this Section, we review specific definitions and ideas from topology and differential geometry that enable us to provide a useful definition of a manifold.


## 机器学习代写|流形学习代写manifold data learning代考|Preserving the Estimated Density


## 机器学习代写|流形学习代写manifold data learning代考|The Optimization

Now that we have a method to estimate the density on a submanifold of $\mathbb{R}^{D}$, we can proceed to define an algorithm for density preserving maps. Suppose we are given a sample $X=\left\{x_{1}, x_{2}, \ldots, x_{m}\right\}$ of $m$ data points $x_{i} \in \mathbb{R}^{D}$ that live on a $d$-dimensional submanifold $M$ of $\mathbb{R}^{D}$. We first proceed to estimate the density at each one of the points, by using a slightly generalized version of the submanifold estimator that has variable bandwidths. Denoting the bandwidth for a given evaluation point $x_{j}$ and a reference (data) point $x_{i}$ by $h_{i j}$, the generalized, variable bandwidth estimator at $x_{j}$ is
$$\hat{f}_{j}=\hat{f}\left(x_{j}\right)=\frac{1}{m} \sum_{i} \frac{1}{h_{i j}^{d}} K_{d}\left(\frac{\left\|x_{j}-x_{i}\right\|_{D}}{h_{i j}}\right) .$$
Variable bandwidth methods allow the estimator to adapt to the inhomogeneities in the data. Various approaches exist for picking the bandwidths $h_{i j}$ as functions of the query (evaluation) point $x_{j}$ and/or the reference point $x_{i}$ [25]. Here, we focus on the $k$th-nearest neighbor approach for evaluation points, i.e., we take $h_{i j}$ to depend only on the evaluation point $x_{j}$, and we let $h_{i j}=h_{j}=$ the distance of the $k$th nearest data (reference) point to the evaluation point $x_{j}$. Here, $k$ is a free parameter that needs to be picked by the user. However, instead of tuning it by hand, one can use a leave-one-out cross-validation score [25] such as the log-likelihood score for the density estimate to pick the best value. This is done by estimating the log-likelihood of each data point by using the leave-one-out version of the density estimate (3.7) for a range of $k$ values, and picking the $k$ that gives the highest log-likelihood.

Now, given the estimates $\hat{f}_{j}=\hat{f}\left(x_{j}\right)$ of the submanifold density at the $D$-dimensional data points $x_{j}$, we want to find a $d$-dimensional representation $X^{\prime}=\left\{x_{1}^{\prime}, x_{2}^{\prime}, \ldots, x_{m}^{\prime}\right\}$, $x_{i}^{\prime} \in \mathbb{R}^{d}$, such that the new estimates $\hat{f}_{i}^{\prime}$ at the points $x_{i}^{\prime} \in \mathbb{R}^{d}$ agree with the original density estimates, i.e.,
$$\hat{f}_{i}^{\prime}=\hat{f}_{i}, \quad i=1, \ldots, m .$$
For this purpose, one can attempt, for example, to minimize the mean squared deviation of $\hat{f}_{i}^{\prime}$ from $\hat{f}_{i}$ as a function of the $x_{i}^{\prime}$s, but such an approach would result in a non-convex optimization problem with many local minima. We formulate an alternative approach involving semidefinite programming, for the special case of the Epanechnikov kernel [25], which is known to be asymptotically optimal for density estimation, and is convenient for formulating a convex optimization problem for the matrix of inner products (the Gram matrix, or the kernel matrix) of the low dimensional data set $X^{\prime}$.
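The variable-bandwidth estimator and the leave-one-out choice of $k$ can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function names, the use of the Epanechnikov kernel at this stage, the convention that $h_j$ is the distance to the $k$th nearest other data point, and the circle test data are all assumptions. Candidate values start at $k=2$ because, with a kernel that vanishes at distance $h_j$, the leave-one-out estimate for $k=1$ is zero.

```python
import numpy as np
from math import gamma, pi

def epanechnikov(u, d):
    """Epanechnikov kernel K_d(u), normalized to integrate to 1 over the unit ball in R^d."""
    unit_ball_volume = pi ** (d / 2) / gamma(d / 2 + 1)
    norm_const = (d + 2) / (2 * unit_ball_volume)
    return np.where(u < 1.0, norm_const * (1.0 - u ** 2), 0.0)

def knn_density(X, d, k, leave_one_out=False):
    """Estimate the density at every data point x_j, with bandwidth h_j equal to
    the Euclidean distance from x_j to its k-th nearest other data point."""
    m = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)   # dist[j, i]
    h = np.sort(dist, axis=1)[:, k]            # index 0 is x_j itself (distance 0)
    contrib = epanechnikov(dist / h[:, None], d) / h[:, None] ** d
    if leave_one_out:
        np.fill_diagonal(contrib, 0.0)         # drop x_j's own contribution
        return contrib.sum(axis=1) / (m - 1)
    return contrib.sum(axis=1) / m

def pick_k(X, d, candidates):
    """Return the k with the highest leave-one-out log-likelihood, plus all scores."""
    scores = {k: float(np.log(knn_density(X, d, k, leave_one_out=True)).sum())
              for k in candidates}
    return max(scores, key=scores.get), scores

# Hypothetical test data: m = 500 points on the unit circle (d = 1, D = 2),
# whose density with respect to arc length is 1 / (2 * pi).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
X = np.column_stack([np.cos(theta), np.sin(theta)])

density = knn_density(X, d=1, k=30)
best_k, scores = pick_k(X, d=1, candidates=range(2, 60, 4))
```

The pairwise distance matrix makes this quadratic in $m$, which is acceptable for a small illustration but not for large data sets.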

## 机器学习代写|流形学习代写manifold data learning代考|The Optimization

The Epanechnikov kernel. The Epanechnikov kernel $k_{e}$ in $d$ dimensions is defined as
$$k_{e}\left(\left\|x_{i}-x_{j}\right\|\right)= \begin{cases}N_{e}\left(1-\left\|x_{i}-x_{j}\right\|^{2}\right), & 0 \leq\left\|x_{i}-x_{j}\right\| \leq 1, \\ 0, & 1 \leq\left\|x_{i}-x_{j}\right\|,\end{cases}$$
where $N_{e}$ is the normalization constant that ensures $\int_{\mathbb{R}^{d}} k_{e}\left(\left\|x-x^{\prime}\right\|\right) d^{d} x^{\prime}=1$. We will assume that the kernel used in the estimates $\hat{f}_{i}$ and $\hat{f}_{i}^{\prime}$ of the density via (3.7) is the Epanechnikov kernel. Owing to its quadratic form (3.9), this kernel facilitates the formulation of a convex optimization problem. Instead of seeking the dimensionally reduced version $X^{\prime}=\left\{x_{1}^{\prime}, \ldots, x_{m}^{\prime}\right\}$ of the data set directly, we will first aim to obtain the kernel matrix $K_{i j}=x_{i}^{\prime} \cdot x_{j}^{\prime}$ for the low-dimensional data points. This is a common approach in the manifold learning literature, where one obtains the low-dimensional data points themselves from the $K_{i j}$ via a singular value decomposition.
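The last step, recovering low-dimensional points from a matrix of inner products, can be sketched as below. For a symmetric positive semidefinite matrix the singular value decomposition coincides with the eigendecomposition, which is what this sketch uses; the function name `embed_from_gram` and the random test data are assumptions made for illustration.

```python
import numpy as np

def embed_from_gram(K, d):
    """Return an (m x d) array whose rows x'_i satisfy x'_i . x'_j ~ K[i, j],
    using the d largest eigenvalues of the symmetric matrix K."""
    K = (K + K.T) / 2.0                        # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(K)       # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:d]        # indices of the d largest
    lam = np.clip(eigvals[idx], 0.0, None)     # guard against tiny negative values
    return eigvecs[:, idx] * np.sqrt(lam)

# Hypothetical check: build K from known centered 2-D points, then recover them.
rng = np.random.default_rng(1)
Y = rng.normal(size=(50, 2))
Y -= Y.mean(axis=0)
K = Y @ Y.T
X_low = embed_from_gram(K, 2)
```

The recovered points agree with the originals only up to a rotation or reflection, since inner products do not determine the orientation of the configuration.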

We next formulate the DPM optimization problem using the Epanechnikov kernel, and comment on the motivation behind it. As in the case of distance-based manifold learning methods, there will likely be various approaches to density-preserving dimensional reduction, some computationally more efficient than the one discussed here. We hope the discussions in this chapter will stimulate further research in this area.

Given the estimated densities $\hat{f}_{i}$, we seek a symmetric, positive semidefinite inner product matrix $K_{i j}=x_{i}^{\prime} \cdot x_{j}^{\prime}$ that results in $d$-dimensional density estimates that agree with $\hat{f}_{i}$. In order to deal with the non-uniqueness problem mentioned during our discussion of density-preserving maps between manifolds (which likely carries over to the discrete setting), we need to pick a suitable objective function to maximize. We choose the objective function to be the same as that of Maximum Variance Unfolding (MVU) [29], namely, $\operatorname{trace}(K)$. After getting rid of translations by constraining the center of mass of the dimensionally reduced data points to the origin, maximizing the objective function $\operatorname{trace}(K)$ becomes equivalent to maximizing the sum of the squared distances between the data points [29].
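A short derivation, added here for illustration, shows why the two objectives coincide. Writing $\left\|x_{i}^{\prime}-x_{j}^{\prime}\right\|^{2}=K_{i i}+K_{j j}-2 K_{i j}$ and summing over all ordered pairs of the $m$ points gives

$$\sum_{i, j}\left\|x_{i}^{\prime}-x_{j}^{\prime}\right\|^{2}=2 m \operatorname{trace}(K)-2 \sum_{i, j} K_{i j}=2 m \operatorname{trace}(K)-2\left\|\sum_{i} x_{i}^{\prime}\right\|^{2} .$$

When the center of mass is constrained to the origin, $\sum_{i} x_{i}^{\prime}=0$ and the last term vanishes, so the sum of squared pairwise distances equals $2 m \operatorname{trace}(K)$. Since $m$ is fixed, maximizing one quantity maximizes the other.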

While the objective function for DPM is the same as that of MVU, the constraints of the former will be weaker. Instead of preserving the distances between $k$-nearest neighbors, the DPM optimization defined below preserves the total contribution of the original $k$-nearest neighbors to the density estimate at the data points. As opposed to MVU, this allows for local stretches of the data set, and results in optimal kernel matrices $K$ that can be faithfully represented by a smaller number of dimensions than the intrinsic dimensionality suggested by MVU. For instance, while MVU is capable of unrolling data on the Swiss roll onto a flat plane, it is impossible to lay data from a spherical cap onto the plane while keeping the distances to the $k$th nearest neighbors fixed. Thus, the constraints of the optimization in MVU are too stringent to give an inner product matrix $K$ of rank 2, when the original data is on an intrinsically curved surface in $\mathbb{R}^{3}$. We will see below that the looser constraints of DPM allow it to do a better job in capturing the intrinsic dimensionality of a curved surface.

## 机器学习代写|流形学习代写manifold data learning代考|Summary

In this chapter, we discussed density preserving maps, a density-based alternative to distance-based methods of manifold learning. This method aims to perform dimensionality reduction on high-dimensional data sets in a way that preserves their density. By using a classical result due to Moser, we proved that density preserving maps to $\mathbb{R}^{d}$ exist even for data on intrinsically curved $d$-dimensional submanifolds of $\mathbb{R}^{D}$ that are globally, or topologically, “simple.” Since the underlying probability density function is arguably one of the most fundamental statistical quantities pertaining to a data set, a method that preserves densities while performing dimensionality reduction is guaranteed to preserve much valuable structure in the data. While distance-preserving approaches distort data on intrinsically curved spaces in various ways, density preserving maps guarantee that certain fundamental statistical information is conserved.

We reviewed a method of estimating the density on a submanifold of Euclidean space. This method was a slightly modified version of the classical method of kernel density estimation, with the additional property that the convergence rate was determined by the intrinsic dimensionality of the data, instead of the full dimensionality of the Euclidean space the data was embedded in. We made a further modification on this estimator to allow for variable “bandwidths,” and used it with a specific kernel function to set up a semidefinite optimization problem for a proof-of-concept approach to density preserving maps. The objective function used was identical to the one in Maximum Variance Unfolding [29], but the constraints were significantly weaker than the distance-preserving constraints in MVU. By testing the methods on two relatively small, synthetic data sets, we experimentally confirmed the theoretical expectations and showed that density preserving maps are better in detecting and reducing to the intrinsic dimensionality of the data than some of the commonly used distance-based approaches that also work by first estimating a kernel matrix.

While the initial formulation presented in this chapter is not yet scalable to large data sets, we hope our discussion will motivate our readers to pursue the idea of density preserving maps further, and explore alternative, superior formulations. One possible approach to speeding up the computation is to use fast semidefinite programming techniques [4].


## 机器学习代写|流形学习代写manifold data learning代考| Density Estimation on Submanifolds


## 机器学习代写|流形学习代写manifold data learning代考|Introduction

Kernel density estimation (KDE) [21] is one of the most popular methods of estimating the underlying probability density function (PDF) of a data set. Roughly speaking, KDE consists of having the data points contribute to the estimate at a given point according to their distances from that point: the closer the point, the bigger the contribution. More precisely, in the simplest multi-dimensional KDE [5], the estimate $\hat{f}_{m}\left(\mathbf{y}_{0}\right)$ of a PDF $f\left(\mathbf{y}_{0}\right)$ at a point $\mathbf{y}_{0} \in \mathbb{R}^{D}$ is given in terms of a sample $\left\{\mathbf{y}_{1}, \ldots, \mathbf{y}_{m}\right\}$ as
$$\hat{f}_{m}\left(\mathbf{y}_{0}\right)=\frac{1}{m} \sum_{i=1}^{m} \frac{1}{h_{m}^{D}} K\left(\frac{\left\|\mathbf{y}_{i}-\mathbf{y}_{0}\right\|}{h_{m}}\right),$$
where $h_{m}>0$, the bandwidth, is chosen to approach zero in a suitable manner as the number $m$ of data points increases, and $K:[0, \infty) \rightarrow[0, \infty)$ is a kernel function that satisfies certain properties such as boundedness. Various theorems exist on the different types and rates of convergence of the estimator to the correct result. The earliest result on the pointwise convergence rate in the multivariable case seems to be given in [5], where it is stated that under certain conditions for $f$ and $K$, assuming $h_{m} \rightarrow 0$ and $m h_{m}^{D} \rightarrow \infty$ as $m \rightarrow \infty$, the mean squared error in the estimate $\hat{f}_{m}\left(\mathbf{y}_{0}\right)$ of the density at a point goes to zero with the rate
$$\operatorname{MSE}\left[\hat{f}_{m}\left(\mathbf{y}_{0}\right)\right]=\mathrm{E}\left[\left(\hat{f}_{m}\left(\mathbf{y}_{0}\right)-f\left(\mathbf{y}_{0}\right)\right)^{2}\right]=O\left(h_{m}^{4}+\frac{1}{m h_{m}^{D}}\right)$$
as $m \rightarrow \infty$. If $h_{m}$ is chosen to be proportional to $m^{-1 /(D+4)}$, one gets
$$\operatorname{MSE}\left[\hat{f}_{m}\left(\mathbf{y}_{0}\right)\right]=O\left(\frac{1}{m^{4 /(D+4)}}\right)$$
as $m \rightarrow \infty$. The two conditions $h_{m} \rightarrow 0$ and $m h_{m}^{D} \rightarrow \infty$ ensure that, as the number of data points increases, the density estimate at a point is determined by the values of the density in a smaller and smaller region around that point, but the number of data points contributing to the estimate (which is roughly proportional to the volume of a region of size $h_{m}$) grows unboundedly, respectively.
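The choice $h_{m} \propto m^{-1 /(D+4)}$ can be motivated by balancing the two terms of the bound. The following sketch is added for illustration, with $a, b>0$ standing for unspecified constants hidden in the $O(\cdot)$ notation. Minimizing

$$\varphi(h)=a h^{4}+\frac{b}{m h^{D}}$$

over $h>0$ gives

$$\varphi^{\prime}(h)=4 a h^{3}-\frac{b D}{m h^{D+1}}=0 \quad \Longrightarrow \quad h^{D+4}=\frac{b D}{4 a m}, \quad \text { i.e., } \quad h \propto m^{-1 /(D+4)} .$$

Substituting back, both terms are of the same order, $h^{4} \propto m^{-4 /(D+4)}$ and $1 /\left(m h^{D}\right) \propto m^{-1+D /(D+4)}=m^{-4 /(D+4)}$, which is the stated rate. The rate deteriorates as $D$ grows, which is why replacing $D$ by a smaller intrinsic dimension matters.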

## 机器学习代写|流形学习代写manifold data learning代考|Motivation for the Submanifold Estimator

We would like to estimate the values of a PDF that lives on an (unknown) $d$-dimensional Riemannian submanifold $M$ of $\mathbb{R}^{D}$, where $d<D$. Usually, $D$-dimensional KDE does not work for such a distribution. This can be intuitively understood by considering a distribution on a line in the plane: 1-dimensional KDE performed on the line (with a bandwidth $h_{m}$ satisfying the asymptotics given above) would converge to the correct density on the line, but 2-dimensional KDE, differing from the former only by a normalization factor that blows up as the bandwidth $h_{m} \rightarrow 0$ (compare (3.1) for the cases $D=2$ and $D=1$), diverges. This behavior is due to the fact that, similar to a “delta function” distribution on $\mathbb{R}$, the $D$-dimensional density of a distribution on a $d$-dimensional submanifold of $\mathbb{R}^{D}$ is, strictly speaking, undefined: the density is zero outside the submanifold, and in order to have proper normalization, it has to be infinite on the submanifold. More formally, the $D$-dimensional probability measure for a $d$-dimensional PDF supported on $M$ is not absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^{D}$, and does not have a probability density function on $\mathbb{R}^{D}$. If one attempts to use $D$-dimensional KDE for data drawn from such a probability measure, the estimator will “attempt to converge” to a singular PDF, one that is infinite on $M$ and zero outside.

For a distribution with support on a line in the plane, we can resort to 1-dimensional KDE to get the correct density on the line, but how could one estimate the density on an unknown, possibly curved submanifold of dimension $d<D$? Essentially the same approach works: even for data that lives on an unknown, curved $d$-dimensional submanifold of $\mathbb{R}^{D}$, it suffices to use the $d$-dimensional kernel density estimator with the Euclidean distance on $\mathbb{R}^{D}$ to get a consistent estimator of the submanifold density. Furthermore, the convergence rate of this estimator can be bounded as in (3.3), with $D$ being replaced by $d$, the intrinsic dimension of the submanifold [20].

The intuition behind this approach is based on three facts: 1) For small bandwidths, the main contribution to the density estimate at a point comes from data points that are nearby; 2) For small distances, a $d$-dimensional Riemannian manifold “looks like” $\mathbb{R}^{d}$, and densities in $\mathbb{R}^{d}$ should be estimated by a $d$-dimensional kernel, instead of a $D$-dimensional one; and 3) For points of $M$ that are close to each other, the intrinsic distances as measured on $M$ are close to Euclidean distances as measured in the surrounding $\mathbb{R}^{D}$. Thus, as the number of data points increases and the bandwidth is taken to be smaller and smaller, estimating the density by using a kernel normalized for $d$ dimensions and distances as measured in $\mathbb{R}^{D}$ should give a result closer and closer to the correct value.
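The contrast between the two normalizations can be seen numerically on a simple curved example. The sketch below is an illustration under stated assumptions (uniform data on the unit circle in $\mathbb{R}^{2}$, an Epanechnikov kernel, fixed bandwidths, and hypothetical function names): the estimate normalized for $d=1$ stays near the true arc-length density $1/(2\pi)$, while the estimate normalized for $D=2$ grows roughly like $1/h$ as the bandwidth shrinks.

```python
import numpy as np
from math import gamma, pi

def epanechnikov(u, dim):
    """Epanechnikov kernel normalized to integrate to 1 over the unit ball in R^dim."""
    unit_ball_volume = pi ** (dim / 2) / gamma(dim / 2 + 1)
    norm_const = (dim + 2) / (2 * unit_ball_volume)
    return np.where(u < 1.0, norm_const * (1.0 - u ** 2), 0.0)

def kde_at(p, Q, h, dim):
    """Kernel estimate at p from sample Q, using Euclidean distances in the
    ambient space but a kernel and bandwidth power chosen for dimension dim."""
    u = np.linalg.norm(Q - p, axis=1) / h
    return float(epanechnikov(u, dim).sum() / (len(Q) * h ** dim))

# Uniform sample on the unit circle: d = 1, D = 2, density 1 / (2 * pi) per unit arc length.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=20_000)
Q = np.column_stack([np.cos(theta), np.sin(theta)])
p = np.array([1.0, 0.0])
true_density = 1.0 / (2.0 * np.pi)

bandwidths = [0.4, 0.2, 0.1]
est_d1 = [kde_at(p, Q, h, dim=1) for h in bandwidths]   # intrinsic normalization
est_D2 = [kde_at(p, Q, h, dim=2) for h in bandwidths]   # ambient normalization
```

Comparing `est_d1` and `est_D2` across the three bandwidths makes the divergence of the ambient-dimension estimator visible without any asymptotic argument.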

We will next give the formal definition of the estimator motivated by these considerations, and state the theorem on its asymptotics. As in the original work of Parzen [21], the pointwise consistency of the estimator can be proven by using a bias-variance decomposition. The asymptotic unbiasedness of the estimator follows from the fact that as the bandwidth converges to zero, the kernel function becomes a “delta function.” Using this fact, it is possible to show that with an appropriate choice for the vanishing rate of the bandwidth, the variance also vanishes asymptotically, completing the proof of the pointwise consistency of the estimator.

## Statement of the Theorem

Let $(M, \mathbf{g})$ be a $d$-dimensional, embedded, complete, compact Riemannian submanifold of $\mathbb{R}^{D}$ ($d<D$) with induced metric $\mathbf{g}$ and injectivity radius $r_{\mathrm{inj}}>0$.${ }^{7}$ Let $d(p, q)=d_{p}(q)$ be the length of a length-minimizing geodesic in $M$ between $p, q \in M$, and let $u(p, q)=u_{p}(q)$ be the geodesic distance between $p$ and $q$ as measured in $\mathbb{R}^{D}$ (thus, $u(p, q)$ is simply the Euclidean distance between $p$ and $q$ in $\mathbb{R}^{D}$). Note that $u(p, q) \leq d(p, q)$. We will denote the Riemannian volume measure on $M$ by $V$, and the volume form by $d V$.${ }^{8}$

Theorem 3.3.1 Let $f: M \rightarrow[0, \infty)$ be a probability density function defined on $M$ (so that the related probability measure is $f V$), and let $K:[0, \infty) \rightarrow[0, \infty)$ be a continuous function that vanishes outside $[0,1)$, is differentiable with a bounded derivative in $[0,1)$, and satisfies the normalization condition $\int_{|\mathbf{z}| \leq 1} K(|\mathbf{z}|) d^{d} \mathbf{z}=1$. Assume $f$ is differentiable to second order in a neighborhood of $p \in M$, and for a sample $q_{1}, \ldots, q_{m}$ of size $m$ drawn from the density $f$, define an estimator $\hat{f}_{m}(p)$ of $f(p)$ as
$$\hat{f}_{m}(p)=\frac{1}{m} \sum_{j=1}^{m} \frac{1}{h_{m}^{d}} K\left(\frac{u_{p}\left(q_{j}\right)}{h_{m}}\right), \tag{3.4}$$
where $h_{m}>0$. If $h_{m}$ satisfies $\lim_{m \rightarrow \infty} h_{m}=0$ and $\lim_{m \rightarrow \infty} m h_{m}^{d}=\infty$, then there exist non-negative numbers $m_{*}$, $C_{b}$, and $C_{V}$ such that for all $m>m_{*}$ the mean squared error of the estimator (3.4) satisfies
$$\operatorname{MSE}\left[\hat{f}_{m}(p)\right]=\mathrm{E}\left[\left(\hat{f}_{m}(p)-f(p)\right)^{2}\right]<C_{b} h_{m}^{4}+\frac{C_{V}}{m h_{m}^{d}}.$$
If $h_{m}$ is chosen to be proportional to $m^{-1 /(d+4)}$, this gives
$$\mathrm{E}\left[\left(\hat{f}_{m}(p)-f(p)\right)^{2}\right]=O\left(\frac{1}{m^{4 /(d+4)}}\right)$$
as $m \rightarrow \infty$.
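The choice $h_{m} \propto m^{-1 /(d+4)}$ can be motivated by minimizing the right-hand side of the bound over the bandwidth. The following short calculation is a sketch that assumes $C_{b}>0$ and $C_{V}>0$ and treats $h$ as a free positive variable; the symbols $B(h)$ and $h^{*}$ are introduced here only for illustration.

$$
\begin{aligned}
B(h) &= C_{b} h^{4}+\frac{C_{V}}{m h^{d}}, \qquad
B^{\prime}(h)=4 C_{b} h^{3}-\frac{d\, C_{V}}{m h^{d+1}}=0
\;\Longrightarrow\;
h^{*}=\left(\frac{d\, C_{V}}{4 C_{b}}\right)^{1 /(d+4)} m^{-1 /(d+4)}, \\
\left(h^{*}\right)^{4} &\propto m^{-4 /(d+4)}, \qquad
\frac{1}{m\left(h^{*}\right)^{d}} \propto m^{-1} m^{d /(d+4)}=m^{-4 /(d+4)} .
\end{aligned}
$$

Both terms of the bound then decay at the same rate $m^{-4 /(d+4)}$, which is the rate stated in the theorem.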
Thus, the bound on the convergence rate of the submanifold density estimator is as in (3.2), (3.3), with the dimensionality $D$ replaced by the intrinsic dimension $d$ of $M$. As mentioned above, the proof of this theorem follows from two lemmas on the convergence rates of the bias and the variance; the $h_{m}^{4}$ term in the bound corresponds to the bias, and the $1 /\left(m h_{m}^{d}\right)$ term corresponds to the variance; see [20] for details. This approach to submanifold density estimation was previously mentioned in [11], and the thesis [10] contains the details, although in a more technical and general approach than the elementary one followed in [20].
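As an illustration of an estimator of the form (3.4), the sketch below applies it to points on the unit circle in $\mathbb{R}^{2}$, so that $d=1$ and $D=2$. Everything specific here is an assumption made for the example rather than something taken from the source: the angular density $(1+0.5 \cos t) /(2 \pi)$, the rejection sampler, the kernel profile $K(r)=\frac{3}{4}\left(1-r^{2}\right)$ (which meets the normalization condition only for $d=1$), and the helper names.

```python
import numpy as np

def sample_circle(m, rng):
    """Draw m points on the unit circle with angular density (1 + 0.5 cos t) / (2 pi)."""
    # Rejection sampling from a uniform proposal; about 2/3 of proposals are
    # accepted, so 4 * m proposals leave far more than m accepted angles.
    t = rng.uniform(0.0, 2.0 * np.pi, size=4 * m)
    keep = rng.uniform(size=t.size) < (1.0 + 0.5 * np.cos(t)) / 1.5
    t = t[keep][:m]
    return np.column_stack([np.cos(t), np.sin(t)])

def kernel_profile(r):
    """K(r) = 3/4 (1 - r^2) on [0, 1); integrates to 1 over the unit ball of R^1."""
    return np.where(r < 1.0, 0.75 * (1.0 - r ** 2), 0.0)

def submanifold_kde(p, samples, h, d):
    """Estimator of the form (3.4): Euclidean distances in R^D, normalization h^d."""
    u = np.linalg.norm(samples - p, axis=1)  # u_p(q_j), measured in the ambient space
    return kernel_profile(u / h).mean() / h ** d

rng = np.random.default_rng(1)
m, d = 20_000, 1                   # d = 1 matches the kernel normalization above
samples = sample_circle(m, rng)
h = m ** (-1.0 / (d + 4))          # bandwidth proportional to m^(-1/(d+4))
p = np.array([1.0, 0.0])           # the point with angle t = 0
estimate = submanifold_kde(p, samples, h, d)
truth = 1.5 / (2.0 * np.pi)        # density with respect to arc length at t = 0
print(f"h={h:.3f}  estimate={estimate:.4f}  true density={truth:.4f}")
```

On the unit circle arc length coincides with the angle, so the angular density is also the density with respect to the Riemannian volume measure. The code never uses the circle's parametrization when estimating: it only needs the ambient coordinates of the sample and the intrinsic dimension $d$.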

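## Introduction

$$\hat{f}_{m}\left(\mathbf{y}_{0}\right)=\frac{1}{m} \sum_{i=1}^{m} \frac{1}{h_{m}^{D}} K\left(\frac{\left|\mathbf{y}_{i}-\mathbf{y}_{0}\right|}{h_{m}}\right),$$

$$\operatorname{MSE}\left[\hat{f}_{m}(p)\right]=O\left(\frac{1}{m^{4 /(D+4)}}\right), \quad \text { as } m \rightarrow \infty .$$

The two conditions $h_{m} \rightarrow 0$ and $m h_{m}^{D} \rightarrow \infty$ ensure that, as the number of data points increases, the density estimate at a point is determined by the density values in a smaller and smaller region around that point, while the number of data points contributing to the estimate (which is roughly proportional to the volume of a region of size $h_{m}$) grows without bound.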

## The Existence of Density Preserving Maps

## Moser’s Theorem

Riemannian manifolds. We begin by restricting our attention to data subspaces which are Riemannian submanifolds of $\mathbb{R}^{D}$. Riemannian manifolds provide a generalization of the notion of a smooth surface in $\mathbb{R}^{3}$ to higher dimensions. As first clarified by Gauss in the two-dimensional case and by Riemann in the general case, it turns out that intrinsic features of the geometry of a surface, such as the lengths of its curves or intrinsic distances between its points, can be given in terms of the so-called metric tensor${ }^{2}$ $\mathbf{g}$, without referring to the particular way the surface is embedded in $\mathbb{R}^{3}$. A space whose geometry is defined in terms of a metric tensor is called a Riemannian manifold (for a rigorous definition, see, e.g., [12, 16, 2]).

The Gauss/Riemann result mentioned above states that if the intrinsic curvature of a Riemannian manifold $\left(M, \mathbf{g}_{M}\right)$ is not zero in an open set $U \subset M$, it is not possible to find a map from $M$ into $\mathbb{R}^{d}$ that preserves the distances between the points of $U$. Thus, there exists a local obstruction, namely, the curvature, to the existence of distance-preserving maps. It turns out that no such local obstruction exists for volume-preserving maps. The only invariant is a global one, namely, the total volume.${ }^{3}$ This is the content of Moser’s theorem on volume-preserving maps, which we state next.

Theorem 3.2.1 (Moser [18]) Let $\left(M, \mathbf{g}_{M}\right)$ and $\left(N, \mathbf{g}_{N}\right)$ be two closed, connected, orientable, $d$-dimensional differentiable manifolds that are diffeomorphic to each other. Let $\tau_{M}$ and $\tau_{N}$ be volume forms, i.e., nowhere vanishing $d$-forms on these manifolds, satisfying $\int_{M} \tau_{M}=\int_{N} \tau_{N}$. Then, there exists a diffeomorphism $\phi: M \rightarrow N$ such that $\tau_{M}=\phi^{*} \tau_{N}$, i.e., the volume form on $M$ is the same as the pull-back of the volume form on $N$ by $\phi$.${ }^{4}$

The meaning of this result is that, if two manifolds with the same “global shape” (i.e., two manifolds that are diffeomorphic) have the same total volume, one can find a map between them that preserves the volume locally. The surfaces of a mug and a torus are the classical examples used for describing global, topological equivalence. Although these objects have the same “global shape” (topology/smooth structure) their intrinsic, local geometries are different. Moser’s theorem states that if their total surface areas are the same, one can find a map between them that preserves the areas locally, as well, i.e., a map that sends all small regions on one surface to regions in the other surface in a way that preserves the areas.

Using this theorem, we now show that it is possible to find density-preserving maps between Riemannian manifolds that have the same total volume. This is due to the fact that if local volumes are preserved under a map, the density of a distribution will also be preserved.
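The step from preserved volumes to preserved densities can be sketched in the notation of Theorem 3.2.1. The symbols $X$, $A$, $f_{M}$, and $f_{N}$ are introduced here for illustration, and the calculation assumes $\phi$ is orientation-preserving. Suppose a random point $X$ on $M$ has density $f_{M}$ with respect to $\tau_{M}$, and $\tau_{M}=\phi^{*} \tau_{N}$. Then for any measurable region $A \subset M$,

$$
P(X \in A)=\int_{A} f_{M}\, \tau_{M}=\int_{A} \phi^{*}\left[\left(f_{M} \circ \phi^{-1}\right) \tau_{N}\right]=\int_{\phi(A)}\left(f_{M} \circ \phi^{-1}\right) \tau_{N} .
$$

Hence $\phi(X)$ has density $f_{N}=f_{M} \circ \phi^{-1}$ with respect to $\tau_{N}$, that is, $f_{N}(\phi(p))=f_{M}(p)$: the density value at each point is carried over unchanged to its image.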

## Dimensional Reduction

These results were formulated in terms of so-called closed manifolds, i.e., compact manifolds without boundary. The practical dimensionality reduction problem we would like to address, on the other hand, involves starting with a $d$-dimensional data submanifold $M$ of $\mathbb{R}^{D}$ (where $d<D$), and dimensionally reducing to $\mathbb{R}^{d}$. In order to be able to do this diffeomorphically, $M$ must be diffeomorphic to a subspace of $\mathbb{R}^{d}$, which is not generally the case for closed manifolds. For instance, although we can find a diffeomorphism from a hemisphere (a manifold with boundary, not a closed manifold) into the plane, we cannot find one from the unit sphere (a closed manifold) into the plane. This is a constraint on all dimensional reduction algorithms that preserve the global topology of the data space, not just density preserving maps. Any algorithm that aims to avoid “tearing” or “folding” the data subspace during the reduction will fail on problems like reducing a sphere to $\mathbb{R}^{2}$.${ }^{5}$

Thus, in order to show that density preserving maps into $\mathbb{R}^{d}$ exist for a useful class of $d$-dimensional data manifolds, we have to make sure that the conclusion of Moser’s theorem and our corollary work for certain manifolds with boundary, or for certain non-compact manifolds, as well. Fortunately, this is not so hard, at least for a simple class of manifolds that is enough to be useful. In proving his theorem for closed manifolds, Moser [18] first gives a proof for a single “coordinate patch” in such a manifold, which, basically, defines a compact manifold with boundary minus the boundary itself. Not all $d$-dimensional manifolds with boundary (minus their boundaries) can be given by atlases consisting of a single coordinate patch, but the ones that can be so given cover a wide range of curved Riemannian manifolds, including the hemisphere and the Swiss roll, possibly with punctures. In the following, we will assume that $M$ consists of a single coordinate patch.
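For the hemisphere, one classical area-preserving map into the plane is the Lambert azimuthal equal-area projection. The sketch below is an illustration chosen for this note, not the construction used in the source: it assumes the unit sphere with colatitude $\theta$ (angle from the pole) and longitude $\varphi$, sends a point to polar coordinates $(\rho, \varphi)$ with $\rho=2 \sin (\theta / 2)$, and checks by finite differences that the Jacobian determinant of the map equals $\sin \theta$, the area density of the sphere in these coordinates. The function names are hypothetical.

```python
import numpy as np

def lambert(theta, phi):
    """Lambert azimuthal equal-area projection of the unit sphere, centered at the pole."""
    rho = 2.0 * np.sin(theta / 2.0)
    return np.stack([rho * np.cos(phi), rho * np.sin(phi)], axis=-1)

def jacobian_det(theta, phi, eps=1e-6):
    """Determinant of d(x, y) / d(theta, phi), estimated by central differences."""
    d_theta = (lambert(theta + eps, phi) - lambert(theta - eps, phi)) / (2.0 * eps)
    d_phi = (lambert(theta, phi + eps) - lambert(theta, phi - eps)) / (2.0 * eps)
    return d_theta[..., 0] * d_phi[..., 1] - d_theta[..., 1] * d_phi[..., 0]

rng = np.random.default_rng(2)
# Random points in the open upper hemisphere, kept away from the pole and the rim.
theta = rng.uniform(0.05, np.pi / 2.0 - 0.05, size=1000)
phi = rng.uniform(0.0, 2.0 * np.pi, size=1000)

# Sphere area element: sin(theta) dtheta dphi. Plane area element pulled back
# through the map: det(J) dtheta dphi. Equal densities mean areas are preserved.
max_error = np.max(np.abs(jacobian_det(theta, phi) - np.sin(theta)))

# The hemisphere (theta < pi/2) maps onto a disk of radius 2 sin(pi/4).
disk_area = np.pi * (2.0 * np.sin(np.pi / 4.0)) ** 2
print(f"max |det J - sin(theta)| = {max_error:.2e}")
print(f"disk area = {disk_area:.6f}, hemisphere area = {2.0 * np.pi:.6f}")
```

The image disk has the same total area as the hemisphere, in line with the total volume being the only invariant, while distances are necessarily distorted because the hemisphere is curved.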

## Intuition on Non-Uniqueness

Note that the results above claim the existence of volume (or density) preserving maps, but not uniqueness. In fact, the space of volume-preserving maps is very large. An intuitive way to see this is to consider the flow of an incompressible fluid in $\mathbb{R}^{3}$. The fluid may cover the same region in space at two given times, but the fluid particles may have gone through significant shuffling. The map from the original configuration of the fluid to the final one is a volume preserving diffeomorphism, assuming the flow is smooth. The infinity of ways a fluid can move shows the infinity of ways of preserving volume.
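A two-dimensional analogue makes the size of this family concrete. The sketch below is an illustration under its own assumptions, not part of the source: it uses shear maps of the plane, $(x, y) \mapsto(x+g(y), y)$ and $(x, y) \mapsto(x, y+g(x))$, which have Jacobian determinant 1 for every smooth function $g$, and verifies numerically that a composition of three arbitrary shears still preserves area. The particular functions `np.sin`, `t ** 2`, and `np.tanh` and all helper names are arbitrary choices.

```python
import numpy as np

def shear_x(p, g):
    """(x, y) -> (x + g(y), y); the Jacobian determinant is 1 for any smooth g."""
    x, y = p
    return np.array([x + g(y), y])

def shear_y(p, g):
    """(x, y) -> (x, y + g(x)); the Jacobian determinant is 1 for any smooth g."""
    x, y = p
    return np.array([x, y + g(x)])

def composite(p):
    """A composition of three shears with arbitrarily chosen profiles."""
    p = shear_x(p, np.sin)
    p = shear_y(p, lambda t: t ** 2)
    p = shear_x(p, np.tanh)
    return p

def numerical_det(f, p, eps=1e-6):
    """Jacobian determinant of f at p, estimated by central differences."""
    p = np.asarray(p, dtype=float)
    columns = []
    for k in range(2):
        step = np.zeros(2)
        step[k] = eps
        columns.append((f(p + step) - f(p - step)) / (2.0 * eps))
    return np.linalg.det(np.column_stack(columns))

rng = np.random.default_rng(3)
test_points = rng.uniform(-1.0, 1.0, size=(100, 2))
dets = np.array([numerical_det(composite, p) for p in test_points])
print(f"max |det J - 1| over test points = {np.max(np.abs(dets - 1.0)):.2e}")
```

Since each shear profile $g$ can be any smooth function, even this restricted family is parametrized by functions rather than by finitely many numbers.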

Distance-preserving maps may also have some non-uniqueness, but this is parametrized by a finite-dimensional group, namely, the isometry group of the Riemannian manifold under consideration.${ }^{6}$ The case of volume-preserving maps is much worse, since the space of volume-preserving diffeomorphisms is infinite-dimensional. Since the aim of this chapter is to describe a manifold-learning method that preserves volumes/densities, we are faced with the following question: Given a data manifold with intrinsic dimension $d$ that is diffeomorphic to a subset of $\mathbb{R}^{d}$, which map, in the infinite-dimensional space of volume-preserving maps from this manifold to $\mathbb{R}^{d}$, is the “best”? In Section 3.4, we will describe an approach to this problem by setting up a specific optimization procedure. But first, let us describe a method for estimating densities on submanifolds.
