## Examples of Mixed Data Modeling

The problem of modeling mixed data is quite representative of many data sets that one often encounters in practical applications. To further motivate the importance of modeling mixed data, we give below a few real-world problems that arise in image processing and computer vision. Most of these problems will be revisited later in this book, and more detailed and principled solutions will be given.
### Face Clustering under Varying Illumination
The first example arises in the context of image-based face clustering. Given a collection of unlabeled images $\{I_{j}\}_{j=1}^{N}$ of several different faces taken under varying illumination, we would like to cluster the images corresponding to the face of the same person. For a Lambertian object, it has been shown that the set of all images taken under all lighting conditions forms a cone in the image space, which can be well approximated by a low-dimensional subspace called the “illumination subspace” (Belhumeur and Kriegman 1998; Basri and Jacobs 2003). For example, if $I_{j}$ is the $j$th image of a face and $d$ is the dimension of the illumination subspace associated with that face, then there exist a mean face $\boldsymbol{\mu}$ and $d$ eigenfaces $\boldsymbol{u}_{1}, \boldsymbol{u}_{2}, \ldots, \boldsymbol{u}_{d}$ such that $I_{j} \approx \boldsymbol{\mu}+\boldsymbol{u}_{1} y_{1 j}+\boldsymbol{u}_{2} y_{2 j}+\cdots+\boldsymbol{u}_{d} y_{d j}$. Now, since the images of different faces will live in different “illumination subspaces,” we can cluster the collection of images by estimating a basis for each one of those subspaces. As we will see later, this is a special case of the subspace clustering problem addressed in Part II of this book.

In the example shown in Figure 1.3, we use a subset of the Yale Face Database B consisting of $N=64 \times 3$ frontal views of three faces (subjects 5, 8, and 10) under 64 varying lighting conditions. For computational efficiency, we first down-sample each image to a size of $30 \times 40$ pixels. We then project the data onto their first three principal components using PCA, as shown in Figure 1.3(a). By modeling the projected data with a mixture model of linear subspaces in $\mathbb{R}^{3}$, we obtain three affine subspaces of dimension 2, 1, and 1, respectively. Despite the successive down-sampling and projection steps, the subspaces lead to a perfect clustering of the face images, as shown in Figure 1.3(b).
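The mean-plus-eigenfaces approximation $I_{j} \approx \boldsymbol{\mu}+\sum_{k} \boldsymbol{u}_{k} y_{k j}$ can be illustrated with a minimal NumPy sketch. It does not load the Yale Face Database B; it assumes synthetic stand-in "images" of $30 \times 40 = 1200$ pixels drawn near a three-dimensional affine subspace, and the function name `pca_project` is hypothetical. PCA is computed here through the SVD of the centered data, which is one common way to obtain the principal components and not necessarily the procedure used for Figure 1.3.

```python
import numpy as np

def pca_project(X, d):
    """PCA of a D x N data matrix X (one vectorized image per column).

    Returns the sample mean (D x 1), an orthonormal basis of the top-d
    principal directions (D x d), and the coefficients Y (d x N).
    """
    mu = X.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(X - mu, full_matrices=False)
    U_d = U[:, :d]                 # the d leading "eigenfaces"
    Y = U_d.T @ (X - mu)           # coordinates y_{kj} of each image
    return mu, U_d, Y

# Synthetic stand-in data (an assumption, not real face images):
# 64 "images" of 30*40 pixels lying near a 3-dimensional affine subspace.
rng = np.random.default_rng(0)
D, N, d = 30 * 40, 64, 3
true_mean = rng.standard_normal((D, 1))
true_basis = rng.standard_normal((D, d))
X = true_mean + true_basis @ rng.standard_normal((d, N))
X = X + 0.01 * rng.standard_normal((D, N))   # small perturbation

mu, U_d, Y = pca_project(X, d)
X_hat = mu + U_d @ Y                          # I_j ~ mu + sum_k u_k y_kj
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(Y.shape, rel_err)
```

Because the synthetic data are built to lie close to a three-dimensional affine subspace, three principal components reconstruct them with a small relative error. Real face images only approximately satisfy this model.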

## Mathematical Representations of Mixture Models

The examples presented in the previous subsection argue forcefully for the development of modeling and estimation techniques for mixture models. Obviously, whether the model associated with a given data set is mixed depends on the class of primitive models considered. In this book, the primitives are normally chosen to be simple classes of geometric models or probabilistic distributions.

For instance, one may choose the primitive models to be linear subspaces. Then one can use an arrangement of linear subspaces $\{S_{i}\}_{i=1}^{n} \subset \mathbb{R}^{D}$,
$$Z \doteq S_{1} \cup S_{2} \cup \cdots \cup S_{n}, \tag{1.7}$$
also called a piecewise linear model, to approximate many nonlinear manifolds or piecewise smooth topological spaces. This is the standard model considered in geometric approaches to generalized principal component analysis (GPCA), which will be studied in Part II of this book.
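To make the union-of-subspaces model concrete, the sketch below assigns each sample to the nearest subspace when orthonormal bases for $S_{1}, \ldots, S_{n}$ are already known. This is only an illustration of the geometry, not GPCA itself, which must also estimate the subspaces. The arrangement (one plane and two lines in $\mathbb{R}^{3}$) and the function names are assumptions chosen for the example.

```python
import numpy as np

def distances_to_subspace(X, basis):
    """Distance from each column of X (D x N) to span(basis).

    `basis` is a D x d matrix with orthonormal columns.
    """
    residual = X - basis @ (basis.T @ X)   # part orthogonal to the subspace
    return np.linalg.norm(residual, axis=0)

def assign_to_nearest_subspace(X, bases):
    """Label each column of X with the index of the closest subspace."""
    dists = np.stack([distances_to_subspace(X, B) for B in bases])  # n x N
    return dists.argmin(axis=0)

# Hypothetical arrangement in R^3: one plane and two lines through the origin.
plane = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # span(e1, e2)
line_a = np.array([[0.0], [0.0], [1.0]])                  # span(e3)
line_b = np.ones((3, 1)) / np.sqrt(3.0)                   # span((1, 1, 1))
bases = [plane, line_a, line_b]

# Draw 20 noise-free samples from each subspace.
rng = np.random.default_rng(1)
X = np.hstack([B @ rng.standard_normal((B.shape[1], 20)) for B in bases])
true_labels = np.repeat(np.arange(len(bases)), 20)

labels = assign_to_nearest_subspace(X, bases)
print((labels == true_labels).all())
```

With noisy data, points near the intersection of two subspaces (here, near the origin) become ambiguous, which is one reason the estimation problem is harder than this assignment step suggests.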

The statistical counterpart to the geometric model in (1.7) is to assume instead that the sample points are drawn independently from a mixture of (near singular) Gaussian distributions $\{p_{\theta_{i}}(\boldsymbol{x})\}_{i=1}^{n}$, where $\boldsymbol{x} \in \mathbb{R}^{D}$ but each distribution has mass concentrated near a subspace. The overall probability density function can be expressed as a sum:
$$q_{\theta}(\boldsymbol{x}) \doteq \pi_{1} p_{\theta_{1}}(\boldsymbol{x})+\pi_{2} p_{\theta_{2}}(\boldsymbol{x})+\cdots+\pi_{n} p_{\theta_{n}}(\boldsymbol{x}),$$
where $\theta=\left(\theta_{1}, \ldots, \theta_{n}, \pi_{1}, \ldots, \pi_{n}\right)$ are the model parameters and $\pi_{i}>0$ are mixing weights with $\pi_{1}+\pi_{2}+\cdots+\pi_{n}=1$. This is the typical model studied in mixtures of probabilistic principal component analysis (PPCA) (Tipping and Bishop 1999a), where each component distribution $p_{\theta_{i}}(\boldsymbol{x})$ is a nearly degenerate Gaussian distribution. A classical way of estimating such a mixture model is the expectation maximization (EM) algorithm, where the membership of each sample is represented as a hidden random variable. Appendix B reviews the general EM method, and Chapter 6 shows how to apply it to the case of multiple subspaces.
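As a rough illustration of the mixture density and of the hidden membership variables, the sketch below evaluates $\log q_{\theta}(\boldsymbol{x})$ and the posterior membership probabilities $\pi_{i} p_{\theta_{i}}(\boldsymbol{x}) / q_{\theta}(\boldsymbol{x})$, which is the quantity an E-step would compute. It is not the algorithm of Appendix B or Chapter 6. The component covariances are assumed to have the form $\boldsymbol{u}\boldsymbol{u}^{\top}+\sigma^{2} I$ with small $\sigma^{2}$ so that mass concentrates near a line, and all names and parameter values are hypothetical.

```python
import numpy as np

def gaussian_log_pdf(X, mean, cov):
    """Log density of N(mean, cov) at each column of X (D x N)."""
    D = X.shape[0]
    diff = X - mean[:, None]
    L = np.linalg.cholesky(cov)               # cov = L @ L.T
    z = np.linalg.solve(L, diff)              # whitened residuals
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (D * np.log(2.0 * np.pi) + log_det + (z ** 2).sum(axis=0))

def mixture_log_pdf_and_posteriors(X, weights, means, covs):
    """Return log q(x) and the n x N matrix of membership posteriors."""
    log_terms = np.stack([
        np.log(w) + gaussian_log_pdf(X, m, C)
        for w, m, C in zip(weights, means, covs)
    ])                                        # log(pi_i * p_i(x)), shape n x N
    log_max = log_terms.max(axis=0)           # log-sum-exp for stability
    log_q = log_max + np.log(np.exp(log_terms - log_max).sum(axis=0))
    return log_q, np.exp(log_terms - log_q)

# Two nearly degenerate Gaussians in R^2 (assumed parameters):
# one concentrated near the x-axis, the other near the y-axis.
sigma2 = 1e-2
directions = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
covs = [np.outer(u, u) + sigma2 * np.eye(2) for u in directions]
means = [np.zeros(2), np.zeros(2)]
weights = [0.5, 0.5]

X = np.array([[2.0, 0.0],
              [0.0, 2.0]])                    # columns: (2, 0) and (0, 2)
log_q, posteriors = mixture_log_pdf_and_posteriors(X, weights, means, covs)
print(posteriors.argmax(axis=0))
```

Each column of `posteriors` sums to one. A point lying along one of the two axes is attributed almost entirely to the component concentrated near that axis, mirroring the geometric assignment to a subspace.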

