### Summary

This chapter presents findings supporting the motivating contention that recognizing SA, and accounting for it efficiently and effectively, matters in georeferenced data analyses. Its discussion begins with a treatment of several subtly different definitions of SA, emphasizing this concept in terms of map pattern. A mathematical formulation of Tobler's First Law of Geography follows the definition of SA. A critical element of quantifying SA is specification of a SWM, whose two most popular forms are matrix $\mathbf{C}$, defined in terms of $n$-by-$n$ binary zero-one indicators based upon the topological structure of a set of areal units, and matrix $\mathbf{W}$, the row-standardized version of matrix $\mathbf{C}$. A SWM enables the positing of different indices to quantify SA that are specific to different data measurement scale types. Of these indices, the most popular is the MC, which was developed for interval/ratio data measurement scales but has been extended to nominal and ordinal data measurement scales. Accordingly, this chapter presents the basics of the MC sampling distribution theory, which is essential for statistical inference purposes. The next section (Section 1.2) addresses impacts of SA on statistical analyses, stressing various ways it affects variance. Discussion then turns to SA contextualized as a violation of the IID assumption of classical statistics, and to how it can be visualized with maps, with a Moran scatterplot, and with other graphical tools. Although the SA literature focuses on its impact upon variance, the last substantive/pedagogic background section furnishes illustrations showing how SA more generally can modify the shape of histograms.
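The relationship between $\mathbf{C}$, $\mathbf{W}$, and the MC can be illustrated with a small numerical sketch. It assumes NumPy, a hypothetical helper `rook_swm` that builds a binary rook-adjacency matrix for a regular square lattice, and the commonly used interval/ratio form of the index, $\mathrm{MC}=\frac{n}{\mathbf{1}^{\mathrm{T}}\mathbf{C}\mathbf{1}} \frac{\mathbf{z}^{\mathrm{T}}\mathbf{C}\mathbf{z}}{\mathbf{z}^{\mathrm{T}}\mathbf{z}}$ with $\mathbf{z}$ the mean-centered variable; the summary itself does not print this formula, so treat it as an assumption of the example.

```python
import numpy as np

def rook_swm(rows, cols):
    """Binary rook-adjacency SWM C for a rows-by-cols square lattice (hypothetical helper)."""
    def path(m):
        # adjacency matrix of a path graph on m nodes
        return np.eye(m, k=1) + np.eye(m, k=-1)
    # unit index is r * cols + c: first term links (r, c)-(r, c+1), second links (r, c)-(r+1, c)
    return np.kron(np.eye(rows), path(cols)) + np.kron(path(rows), np.eye(cols))

def row_standardize(C):
    """Matrix W: each row of C divided by its row sum (assumes no isolated areal units)."""
    return C / C.sum(axis=1, keepdims=True)

def moran_coefficient(y, C):
    """MC for an interval/ratio variable y, in the form assumed in the lead-in."""
    z = y - y.mean()
    return (y.size / C.sum()) * (z @ C @ z) / (z @ z)

rows, cols = 4, 4
C = rook_swm(rows, cols)
W = row_standardize(C)

r, c = np.divmod(np.arange(rows * cols), cols)
gradient = c.astype(float)                               # smooth west-to-east trend
checkerboard = np.where((r + c) % 2 == 0, 1.0, -1.0)     # alternating map pattern

mc_gradient = moran_coefficient(gradient, C)
mc_checkerboard = moran_coefficient(checkerboard, C)
print("MC, gradient map:", mc_gradient)
print("MC, checkerboard map:", mc_checkerboard)
```

The gradient map, in which similar values cluster, yields a positive MC, whereas the checkerboard map, in which every pair of neighbors has dissimilar values, yields a negative one.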

## The mean and variance of the MC for linear regression residuals

Following Cliff and Ord (1973), Upton and Fingleton (1985, p. 338) show that the mean and variance of the MC for linear regression residuals, for which the SWM $\mathbf{C}$ is symmetric and the number of covariates is $p$, are as follows, where $\mathbf{M}=\mathbf{I}-\mathbf{X}\left(\mathbf{X}^{\mathrm{T}} \mathbf{X}\right)^{-1} \mathbf{X}^{\mathrm{T}}$ is the standard projection matrix from linear regression theory:
$$\begin{gathered} E(\mathrm{MC})=\frac{n}{\mathbf{1}^{\mathrm{T}} \mathbf{C} \mathbf{1}} \frac{\operatorname{TR}(\mathbf{M C})}{n-k}, \text { and } \\ \operatorname{VAR}(\mathrm{MC})=2\left(\frac{n}{\mathbf{1}^{\mathrm{T}} \mathbf{C} \mathbf{1}}\right)^{2}\left[\frac{\operatorname{TR}(\mathbf{M C M C})}{(n-k)(n-k+2)}-\frac{[\operatorname{TR}(\mathbf{M C})]^{2}}{(n-k)^{2}}\right], \end{gathered}$$
where $\mathrm{k}=\mathrm{p}+1$.
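The two expressions above can be evaluated directly with a few matrix operations. The following is a minimal sketch, assuming NumPy, a hypothetical helper `rook_swm` that builds a binary rook-adjacency matrix $\mathbf{C}$ for a regular square lattice, and a design matrix `X` whose first column is the intercept (so that $k$ equals the number of columns of `X`). It codes the variance term by term as it is written in this section.

```python
import numpy as np

def rook_swm(rows, cols):
    """Binary rook-adjacency SWM C for a rows-by-cols square lattice (hypothetical helper)."""
    def path(m):
        # adjacency matrix of a path graph on m nodes
        return np.eye(m, k=1) + np.eye(m, k=-1)
    return np.kron(np.eye(rows), path(cols)) + np.kron(path(rows), np.eye(cols))

def mc_residual_moments(X, C):
    """Mean and variance of the MC for regression residuals, as written in the text.

    X : n-by-k design matrix, intercept column included (k = p + 1).
    C : symmetric n-by-n SWM.
    """
    n, k = X.shape
    # projection matrix M = I - X (X'X)^{-1} X'
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    scale = n / C.sum()                  # n / (1' C 1)
    MC = M @ C
    tr_MC = np.trace(MC)
    tr_MCMC = np.trace(MC @ MC)
    mean = scale * tr_MC / (n - k)
    var = 2.0 * scale**2 * (
        tr_MCMC / ((n - k) * (n - k + 2)) - tr_MC**2 / (n - k) ** 2
    )
    return mean, var

C = rook_swm(10, 10)
n = C.shape[0]

# intercept-only model (p = 0, k = 1)
X0 = np.ones((n, 1))
mean0, var0 = mc_residual_moments(X0, C)
print("E(MC):", mean0, " -1/(n-1):", -1.0 / (n - 1))
print("standard error:", np.sqrt(var0))
```

With an intercept as the only column of `X`, $\operatorname{TR}(\mathbf{MC})=-\mathbf{1}^{\mathrm{T}}\mathbf{C}\mathbf{1}/n$ because $\mathbf{C}$ has a zero diagonal, so the mean reduces to the familiar $-1/(n-1)$, which offers a quick check on the code.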
For the following cases, suppose $\mathbf{X}_{j}$ is proportional to the principal eigenvector of matrix $\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right) \mathbf{C}\left(\mathbf{I}-\mathbf{1 1}^{\mathrm{T}} / n\right)$, whose eigenvalue is $\lambda_{1}$, which results in the maximum degree of PSA (a worst-case scenario). The maximum principal eigenvalue for a planar surface partitioning is such that $\lambda_{1} \leq a+\sqrt{2 n-b}$, for appropriately calculated real numbers $a$ and $b$ (Griffith & Sone, 1995, p. 170). However, most empirical surface partitionings relate more to a mixture of a regular square and a regular hexagonal tessellation. In addition, most empirical surfaces do not have any areal units whose number of neighbors is a function of $n$; rather, areal units with 30 or more rook adjacency-defined neighbors are rare. Accordingly, their principal eigenvalues are much closer to 6 and not a function of $n$.
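The claim that the principal eigenvalue need not grow with $n$ can be checked numerically for one of the regular tessellations mentioned. The sketch below assumes NumPy and a hypothetical helper `rook_swm` for a regular square lattice with rook adjacency; for that configuration the eigenvalues of $\mathbf{C}$ are known to stay below 4, so the principal eigenvalue of the doubly centered matrix does as well. It is an illustration for this one configuration, not a reproduction of the cases in the text.

```python
import numpy as np

def rook_swm(rows, cols):
    """Binary rook-adjacency SWM C for a rows-by-cols square lattice (hypothetical helper)."""
    def path(m):
        # adjacency matrix of a path graph on m nodes
        return np.eye(m, k=1) + np.eye(m, k=-1)
    return np.kron(np.eye(rows), path(cols)) + np.kron(path(rows), np.eye(cols))

def principal_eigenvalue(C):
    """Largest eigenvalue of (I - 11'/n) C (I - 11'/n)."""
    n = C.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return np.linalg.eigvalsh(P @ C @ P)[-1]

eigs = {}
for side in (5, 10, 20, 30):
    n = side * side
    eigs[n] = principal_eigenvalue(rook_swm(side, side))
    print(f"n = {n:4d}   lambda_1 = {eigs[n]:.4f}")
```

As the lattice grows, the printed values remain bounded rather than scaling with $\sqrt{n}$, in line with the argument that empirical partitionings behave more like regular tessellations than like the planar worst case.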

Convergence on the asymptotic variance presented here tends to be slower as the number of covariates increases. Nevertheless, it is a good approximation for cases where $n-k$ is of a practical magnitude (e.g., at least 100).

Table 1.A.1 summarizes results for a numerical example in which the number of areal units and the geographic configuration vary. One modification indicated by Case II is that multiple correlated covariates are replaced by their principal components, in a dimension-reducing way, which decreases the value of $k$ in the denominator (an argument could be made not to decrease $k$). Furthermore, the constrained maximum hexagonal tessellation has areal units labeled 1 and $n$ that wrap around the outside of a complete rectangular region, but constrained so that their maximum numbers of neighbors are 30. Results in this table corroborate the contention that the asymptotic standard error of the MC for regression residuals is a very good approximation as long as the principal eigenvalue of matrix $\mathbf{C}$ is not a function of $n$, no areal units have numbers of neighbors that are a function of $n$, and the number of covariates is not a function of $n$. Griffith (2010) demonstrates this same result for the case of no covariates. One important index here is $n-k$, which needs to be relatively large for the asymptotic standard error to be a good approximation. The smallest value of $n-k$ in Table 1.A.1 is 97, for which the approximation is quite good. Results for $n<50$ tend to be poor in the linear regression context, whereas results for $n<25$ tend to be poor in the univariate context (i.e., no covariates).

The expected value of the $\mathrm{MC}$ is easier to calculate than its standard error; this quantity goes to zero as $\mathrm{n}$ goes to infinity.
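The behavior of the expected value as $n$ grows can be illustrated with the expression for $E(\mathrm{MC})$ given earlier in this section. The sketch below assumes NumPy, a hypothetical helper `rook_swm` for a regular square lattice, and an arbitrary illustrative design matrix made up of an intercept and a west-to-east trend covariate; none of these choices comes from Table 1.A.1.

```python
import numpy as np

def rook_swm(rows, cols):
    """Binary rook-adjacency SWM C for a rows-by-cols square lattice (hypothetical helper)."""
    def path(m):
        # adjacency matrix of a path graph on m nodes
        return np.eye(m, k=1) + np.eye(m, k=-1)
    return np.kron(np.eye(rows), path(cols)) + np.kron(path(rows), np.eye(cols))

def expected_mc(X, C):
    """E(MC) = [n / (1'C1)] * TR(MC) / (n - k) for regression residuals."""
    n, k = X.shape
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    return (n / C.sum()) * np.trace(M @ C) / (n - k)

results = {}
for side in (5, 10, 20):
    n = side * side
    C = rook_swm(side, side)
    trend = (np.arange(n) % side).astype(float)   # column position of each unit
    X = np.column_stack([np.ones(n), trend])      # intercept plus one covariate, k = 2
    results[n] = expected_mc(X, C)
    print(f"n = {n:4d}   E(MC) = {results[n]:.5f}")
```

The expected value is negative in each case and shrinks in magnitude as the lattice grows, consistent with it going to zero as $n$ goes to infinity.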

