## Principal components analysis: A reconnaissance

Eigendecomposition (i.e., rewriting a matrix in terms of its eigenvalues and eigenvectors) of a correlation matrix produces a set of eigenvalues that can be arranged in descending order, along with their corresponding eigenvectors. These eigenvectors are referred to as principal components (PCs) and have the intriguing property that the amount of variance they capture follows the same order as the magnitudes of their eigenvalues. In other words, the first eigenvector summarizes the most variance, the second eigenvector summarizes the second largest amount of variance, and so forth. Because the first few eigenvectors often are sufficient for representing most of the information contained in their correlation matrix, principal components analysis (PCA) commonly is used for variable selection. However, here we use PCA to illustrate linkages between eigenvectors and spatial autocorrelation (SA).

Griffith (1984) began to explore the application of PCA to the binary spatial weights matrix (SWM) $\mathbf{C}$, initially making little headway because the analysis was of $n$ indicator variable correlation (i.e., phi) coefficients. Later, a simple modification of matrix $\mathbf{C}$ produced a breakthrough: adding an identity matrix to $\mathbf{C}$ [i.e., $(\mathbf{C}+\mathbf{I})$] renders a SWM mimicking a correlation matrix, allowing the application of conventional PCA to it in a way that yields interpretable results (see Appendix 2.A). Furthermore, based on the mathematical properties of eigenvalues and eigenvectors, matrices $(\mathbf{C}+\mathbf{I})$ and $\mathbf{C}$ have the same eigenvectors, and the eigenvalues of $(\mathbf{C}+\mathbf{I})$ are those of matrix $\mathbf{C}$ plus one (i.e., $\lambda_{j}+1$). The resulting PCA gives an important insight into the asymptotic relationship between the SWM $\mathbf{C}$ eigenvectors and SA: the second PC corresponds to the maximum positive SA (PSA), and the last PC to the maximum negative SA (NSA). More specifically, PCA made possible the discovery of, and now makes obvious, the fundamental theorem of MESF, which pertains to matrix $\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)$.
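The eigenvalue shift described above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small hypothetical 3-by-4 regular grid with rook contiguity; the helper `rook_grid_swm` is an illustrative name, not something taken from the source.

```python
import numpy as np

def rook_grid_swm(rows, cols):
    """Binary rook-contiguity SWM for a rows x cols regular grid (illustrative helper)."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:          # neighbour to the east
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:          # neighbour to the south
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

C = rook_grid_swm(3, 4)               # hypothetical 12-unit surface
I = np.eye(C.shape[0])

lam_C, E_C = np.linalg.eigh(C)        # eigenvalues in ascending order
lam_CI = np.linalg.eigvalsh(C + I)

# Eigenvalues of C + I are those of C shifted by one.
shift_ok = np.allclose(lam_CI, lam_C + 1.0)

# Each eigenvector of C satisfies (C + I) E_j = (lambda_j + 1) E_j.
vecs_ok = np.allclose((C + I) @ E_C, E_C * (lam_C + 1.0))

print(shift_ok, vecs_ok)
```

The check multiplies the eigenvectors of $\mathbf{C}$ by $(\mathbf{C}+\mathbf{I})$ rather than comparing two separately computed eigenvector sets, because eigenvectors are unique only up to sign (and up to rotation within repeated eigenvalues).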

## The spectral decomposition of a modified SWM

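The MC can be written in matrix form as
$$\mathrm{MC}=\frac{n}{\mathbf{1}^{\mathrm{T}} \mathbf{C} \mathbf{1}} \, \frac{\mathbf{Y}^{\mathrm{T}}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{Y}}{\mathbf{Y}^{\mathrm{T}}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{Y}}$$
The modified matrix $\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)$ in its numerator is the matrix that relates to SA. Accordingly, the pair of eigenfunction problems becomes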
$$\operatorname{Det}\left[\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)-\lambda \mathbf{I}\right]=0$$
and
$$\left[\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)-\lambda_{j} \mathbf{I}\right] \mathbf{E}_{j}=\mathbf{0}, \quad j=1,2, \ldots, n$$

subject to $\mathbf{E}_{j}^{\mathrm{T}} \mathbf{E}_{j}=1$ and $\mathbf{E}_{j}^{\mathrm{T}} \mathbf{E}_{k}=0, j \neq k$. Because matrix $\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)$ is square and symmetric, it can be decomposed into a series of rank 1 matrices consisting of the products of the eigenvalues and eigenvectors; that is,
$$\left(\mathbf{I}-\frac{\mathbf{1 1}^{\mathrm{T}}}{n}\right) \mathbf{C}\left(\mathbf{I}-\frac{\mathbf{1 1}^{\mathrm{T}}}{n}\right)=\sum_{j=1}^{n} \lambda_{j} \mathbf{E}_{j} \mathbf{E}_{j}^{\mathrm{T}}$$
This right-hand summation expression is a matrix expansion of the standard eigendecomposition $\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)=\mathbf{E} \boldsymbol{\Lambda} \mathbf{E}^{\mathrm{T}}$, where $\boldsymbol{\Lambda}$ is the diagonal matrix containing the $n$ eigenvalues $\lambda_{j}$.
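The rank 1 expansion can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions: the five-unit chain of areal units is a hypothetical SWM chosen only for brevity, and the variable names are illustrative.

```python
import numpy as np

# Hypothetical SWM: five areal units in a row, each adjacent to its neighbours.
n = 5
C = np.zeros((n, n))
for i in range(n - 1):
    C[i, i + 1] = C[i + 1, i] = 1.0

ones = np.ones((n, 1))
M = np.eye(n) - ones @ ones.T / n     # projection matrix I - 11^T/n
MCM = M @ C @ M                       # symmetric because C and M are symmetric

# eigh returns orthonormal eigenvectors as the columns of E.
lam, E = np.linalg.eigh(MCM)

# Sum of rank 1 matrices lambda_j * E_j E_j^T.
rebuilt = sum(lam[j] * np.outer(E[:, j], E[:, j]) for j in range(n))

print(np.allclose(rebuilt, MCM))                    # rank 1 expansion
print(np.allclose(E @ np.diag(lam) @ E.T, MCM))     # compact form E Lambda E^T
print(np.allclose(E.T @ E, np.eye(n)))              # orthonormality constraints
```

`numpy.linalg.eigh` is used rather than `numpy.linalg.eig` because the matrix is symmetric, which guarantees real eigenvalues and orthonormal eigenvectors.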

## Representing the MC with eigenfunctions

Let vector $\mathbf{Y}=\mathbf{E}_{j}$. Next, substitute $\mathbf{E}_{j}$ into Eq. (2.4):
$$\frac{n}{\mathbf{1}^{\mathrm{T}} \mathbf{C} \mathbf{1}} \, \frac{\mathbf{E}_{j}^{\mathrm{T}}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{E}_{j}}{\mathbf{E}_{j}^{\mathrm{T}}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{E}_{j}}=\frac{n}{\mathbf{1}^{\mathrm{T}} \mathbf{C} \mathbf{1}} \, \frac{\lambda_{j}}{1}$$
which is the same as Eq. (2.3). The denominator reduces to 1 because an eigenvector $\mathbf{E}_{j}$ with a nonzero eigenvalue has unit length and is orthogonal to $\mathbf{1}$. This outcome highlights why the extreme eigenvalues of matrix $\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)$, $\lambda_{1}$ and $\lambda_{n}$, respectively, define the maximum and minimum values of a MC: they optimize its Rayleigh quotient. This feature of the MC distinguishes it from a Pearson product moment correlation coefficient: it does not have extreme values of $\pm 1$ but rather has an upper bound that often exceeds 1 by as much as $0.15$, or more, and a lower bound that often is closer to $-0.5$ than to $-1$.
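The substitution above can be sketched numerically. The example assumes NumPy and a hypothetical 4-by-4 rook-contiguity grid; `rook_grid_swm` and `moran_coefficient` are illustrative helper names, and the random vectors only probe the Rayleigh quotient bounds rather than prove them.

```python
import numpy as np

def rook_grid_swm(rows, cols):
    """Binary rook-contiguity SWM for a rows x cols regular grid (illustrative helper)."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols:
                C[i, i + 1] = C[i + 1, i] = 1.0
            if r + 1 < rows:
                C[i, i + cols] = C[i + cols, i] = 1.0
    return C

def moran_coefficient(y, C):
    """MC in matrix form: [n / (1^T C 1)] * y^T M C M y / (y^T M y)."""
    n = len(y)
    M = np.eye(n) - np.ones((n, n)) / n
    return (n / C.sum()) * (y @ M @ C @ M @ y) / (y @ M @ y)

C = rook_grid_swm(4, 4)
n = C.shape[0]
M = np.eye(n) - np.ones((n, n)) / n
lam, E = np.linalg.eigh(M @ C @ M)    # ascending: lam[0] smallest, lam[-1] largest
scale = n / C.sum()                   # n / (1^T C 1)

# MC of the extreme eigenvectors equals the scaled extreme eigenvalues.
mc_max = moran_coefficient(E[:, -1], C)
mc_min = moran_coefficient(E[:, 0], C)
print(mc_max, scale * lam[-1])
print(mc_min, scale * lam[0])

# MC values for arbitrary vectors stay inside [mc_min, mc_max].
rng = np.random.default_rng(0)
random_mcs = [moran_coefficient(rng.normal(size=n), C) for _ in range(1000)]
print(min(random_mcs) >= mc_min, max(random_mcs) <= mc_max)
```

Only the two extreme eigenvectors are evaluated because the eigenvector proportional to $\mathbf{1}$ (eigenvalue zero) makes the denominator vanish, so the MC is undefined for it.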

Fig. 2.3A portrays the 2010 census tract surface partitioning ($n=1052$) for the Dallas-Fort Worth (DFW) metroplex, and Fig. 2.3B is ArcMap SA output. The principal eigenvalue for this surface partitioning is $6.2675604$, based on rook contiguity, with $\mathbf{1}^{\mathrm{T}} \mathbf{C} \mathbf{1}=5626$. Therefore, for PSA, $\mathrm{MC}=(1052 / 5626)(6.2675604)=1.171965$. This maximum PSA value exceeds 1 by more than $0.17$. Meanwhile, for NSA, $\mathrm{MC}=(1052 / 5626)(-3.949034)=-0.738426$. This maximum NSA value is closer to $-0.5$ than to $-1$.
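The DFW arithmetic can be reproduced directly from the figures quoted above. This short sketch uses only those reported values; the variable names are illustrative.

```python
# Values reported in the text for the DFW census tract partitioning.
n = 1052                  # number of census tracts
sum_c = 5626              # 1^T C 1 under rook contiguity
lam_max = 6.2675604       # largest eigenvalue
lam_min = -3.949034       # smallest eigenvalue

scale = n / sum_c         # n / (1^T C 1)
mc_max = scale * lam_max  # upper bound of the MC (maximum PSA)
mc_min = scale * lam_min  # lower bound of the MC (maximum NSA)

print(f"{mc_max:.6f}")    # 1.171965
print(f"{mc_min:.6f}")    # -0.738426

# The lower bound lies nearer to -0.5 than to -1.
closer_to_half = abs(mc_min + 0.5) < abs(mc_min + 1.0)
print(closer_to_half)     # True
```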

