## Space-time covariance functions

A particular covariance structure must be assumed for the $Y\left(\mathbf{s}_{i}, t\right)$ process. The pivotal space-time covariance function is defined as
$$C\left(\mathbf{s}_{1}, \mathbf{s}_{2} ; t_{1}, t_{2}\right)=\operatorname{Cov}\left[Y\left(\mathbf{s}_{1}, t_{1}\right), Y\left(\mathbf{s}_{2}, t_{2}\right)\right] .$$
The zero mean spatio-temporal process $Y(\mathbf{s}, t)$ is said to be covariance stationary if
$$C\left(\mathbf{s}_{1}, \mathbf{s}_{2} ; t_{1}, t_{2}\right)=C\left(\mathbf{s}_{1}-\mathbf{s}_{2} ; t_{1}-t_{2}\right)=C(d ; \tau)$$

where $d=\mathbf{s}_{1}-\mathbf{s}_{2}$ and $\tau=t_{1}-t_{2}$. The process is said to be isotropic if
$$C(d ; \tau)=C(\|d\| ;|\tau|),$$
that is, the covariance function depends upon the separation vectors only through their lengths $\|d\|$ and $|\tau|$. Processes which are not isotropic are called anisotropic. In the literature isotropic processes are popular because of their simplicity and interpretability. Moreover, a number of simple parametric forms are available to model them.

A further simplifying assumption is separability; see, for example, Mardia and Goodall (1993). Separability is a concept used in modeling multivariate spatial data, including spatio-temporal data. A separable covariance function in space and time is simply the product of two covariance functions, one for space and the other for time.
The process $Y(\mathbf{s}, t)$ is said to be separable if
$$C(\|d\| ;|\tau|)=C_{s}(\|d\|)\, C_{t}(|\tau|) .$$
Suitable forms for the functions $C_{s}(\cdot)$ and $C_{t}(\cdot)$ must now be assumed. A very general choice is to adopt the Matérn covariance function introduced before.
There is a growing literature on methods for constructing non-separable and non-stationary spatio-temporal covariance functions that are useful for modeling. See, for example, Gneiting (2002), who develops a class of non-separable covariance functions. A simple example is:
$$C(\|d\| ;|\tau|)=(1+|\tau|)^{-1} \exp \left\{-\|d\| /(1+|\tau|)^{\beta / 2}\right\}, \tag{2.3}$$
where $\beta \in[0,1]$ is a space-time interaction parameter. For $\beta=0$, (2.3) provides a separable covariance function. The other extreme case, $\beta=1$, corresponds to a totally non-separable covariance function. Figure 2.3 plots this function for four different values of $\beta$: $0$, $0.25$, $0.5$ and $1$. Some discernible differences between the functions can be seen at larger distances, in the top right corner of each plot. However, the differences are not easy to describe, and it is even harder to see differences in model fits. The paper by Gneiting (2002) provides further descriptions of non-separable covariance functions.
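The role of $\beta$ in the example covariance function above can be checked numerically. The following is a minimal sketch, not taken from the source: the function names `gneiting_cov` and `separability_gap` and the evaluation grids are assumptions made for illustration. It uses the fact that a separable covariance function satisfies $C(\|d\|;|\tau|)\,C(0;0)=C(\|d\|;0)\,C(0;|\tau|)$ and reports the largest violation of that identity on a grid.

```python
import numpy as np

def gneiting_cov(d, tau, beta):
    """C(||d||; |tau|) = (1 + |tau|)^{-1} exp{-||d|| / (1 + |tau|)^{beta/2}}."""
    d = np.abs(np.asarray(d, dtype=float))
    tau = np.abs(np.asarray(tau, dtype=float))
    return np.exp(-d / (1.0 + tau) ** (beta / 2.0)) / (1.0 + tau)

def separability_gap(beta, d, tau):
    """Largest |C(d; tau) - C(d; 0) C(0; tau) / C(0; 0)| over the grid d x tau."""
    D, T = np.meshgrid(d, tau, indexing="ij")
    full = gneiting_cov(D, T, beta)
    product = (gneiting_cov(D, 0.0, beta) * gneiting_cov(0.0, T, beta)
               / gneiting_cov(0.0, 0.0, beta))
    return float(np.max(np.abs(full - product)))

# Hypothetical evaluation grids for spatial distance and time lag.
d_grid = np.linspace(0.0, 3.0, 31)
tau_grid = np.linspace(0.0, 3.0, 31)

gaps = {beta: separability_gap(beta, d_grid, tau_grid)
        for beta in (0.0, 0.25, 0.5, 1.0)}
for beta, gap in gaps.items():
    print(f"beta = {beta:4.2f}: max departure from separability = {gap:.4f}")
```

The gap is zero (up to rounding) at $\beta=0$ and grows with $\beta$, which is one way to quantify the space-time interaction that is hard to see by eye in a plot.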

There are other ways to construct non-separable covariance functions, for example by mixing more than one spatio-temporal process (see e.g. Sahu et al., 2006), or by including a further level of hierarchy in which the covariance matrix obtained using $C(\|d\| ;|\tau|)$ follows an inverse-Wishart distribution centred around a separable covariance matrix. Section 8.3 of the book by Banerjee et al. (2015) lists many more strategies. For example, Schmidt and O'Hagan (2003) construct non-stationary spatio-temporal covariance structures via deformations.

## Kriging or optimal spatial prediction

The jargon "Kriging" refers to a form of spatial prediction at an unobserved location based on the observed data. That it is a popular method is borne out by the fact that "Kriging" is a verbification of a method of spatial prediction named after its inventor, D.G. Krige, a South African mining engineer. Kriging solves the problem of predicting $Y(\mathbf{s}_{0})$ at a new location $\mathbf{s}_{0}$, having observed data $y\left(\mathbf{s}_{1}\right), \ldots, y\left(\mathbf{s}_{n}\right)$.

Classical statistical theory based on the squared error loss function in prediction yields the sample mean $\bar{y}$ as the optimal predictor for $Y\left(\mathbf{s}_{0}\right)$ if spatial dependency between the random variables $Y\left(\mathbf{s}_{0}\right), Y\left(\mathbf{s}_{1}\right), \ldots, Y\left(\mathbf{s}_{n}\right)$ is ignored. Because of Tobler's law, the prediction for $Y\left(\mathbf{s}_{0}\right)$ will be improved if spatial dependency is instead taken into account. The observations nearer to the prediction location, $\mathbf{s}_{0}$, will receive higher weights in the prediction formula than the observations further away. The question now is how to determine these weights. Kriging provides the answer.
In order to proceed further we assume that $Y(\mathbf{s})$ is a GP, although some of the results we discuss below also hold in general without this assumption. In order to perform Kriging it is assumed that the best linear unbiased predictor $\hat{Y}\left(\mathbf{s}_{0}\right)$, with weights $\ell_{i}$, is of the form $\sum_{i=1}^{n} \ell_{i} Y\left(\mathbf{s}_{i}\right)$ and that dependence between $Y\left(\mathbf{s}_{0}\right), Y\left(\mathbf{s}_{1}\right), \ldots, Y\left(\mathbf{s}_{n}\right)$ is described by a covariance function, $C(d \mid \psi)$, of the distance $d$ between any two locations, as defined above in this chapter. The Kriging weights are easily determined by evaluating the conditional mean of $Y\left(\mathbf{s}_{0}\right)$ given the observed values $y\left(\mathbf{s}_{1}\right), \ldots, y\left(\mathbf{s}_{n}\right)$. These weights are "optimal" in the same statistical sense that the mean $E(X)$ minimizes the expected value of the squared error loss function, i.e. $E(X-a)^{2}$ is minimized at $a=E(X)$. Here we take $X$ to be the conditional random variable $Y\left(\mathbf{s}_{0}\right)$ given $y\left(\mathbf{s}_{1}\right), \ldots, y\left(\mathbf{s}_{n}\right)$.

The actual values of the optimal weights are derived by partitioning the mean vector, $\boldsymbol{\mu}_{n+1}$, and the covariance matrix, $\Sigma$, of $Y\left(\mathbf{s}_{0}\right), Y\left(\mathbf{s}_{1}\right), \ldots, Y\left(\mathbf{s}_{n}\right)$ as follows. Let
$$\boldsymbol{\mu}_{n+1}=\left(\begin{array}{c} \mu_{0} \\ \boldsymbol{\mu} \end{array}\right), \quad \Sigma=\left(\begin{array}{cc} \sigma_{00} & \Sigma_{01} \\ \Sigma_{10} & \Sigma_{11} \end{array}\right)$$
where $\boldsymbol{\mu}$ is the vector of the means of $\mathbf{Y}=\left(Y\left(\mathbf{s}_{1}\right), \ldots, Y\left(\mathbf{s}_{n}\right)\right)^{\prime}$; $\sigma_{00}=\operatorname{Var}\left(Y\left(\mathbf{s}_{0}\right)\right)$; $\Sigma_{01}=\Sigma_{10}^{\prime}=\operatorname{Cov}\left(Y\left(\mathbf{s}_{0}\right), \mathbf{Y}\right)$; and $\Sigma_{11}=\operatorname{Var}\left(\mathbf{Y}\right)$. Now standard multivariate normal distribution theory tells us that
$$Y\left(\mathbf{s}_{0}\right) \mid \mathbf{y} \sim N\left(\mu_{0}+\Sigma_{01} \Sigma_{11}^{-1}(\mathbf{y}-\boldsymbol{\mu}),\; \sigma_{00}-\Sigma_{01} \Sigma_{11}^{-1} \Sigma_{10}\right) .$$
In order to facilitate a clear understanding of the effect of the underlying spatial dependence on Kriging we assume a zero-mean GP, i.e. $\boldsymbol{\mu}_{n+1}=\mathbf{0}$. Now we have $E\left(Y\left(\mathbf{s}_{0}\right) \mid \mathbf{y}\right)=\Sigma_{01} \Sigma_{11}^{-1} \mathbf{y}$, and thus we see that the optimal Kriging weights are particular functions of the assumed covariance function. Note that the weights do not depend on the underlying common spatial variance, as that is canceled in the product $\Sigma_{01} \Sigma_{11}^{-1}$. However, the spatial variance will affect the accuracy of the predictor, since $\operatorname{Var}\left(Y\left(\mathbf{s}_{0}\right) \mid \mathbf{y}\right)=\sigma_{00}-\Sigma_{01} \Sigma_{11}^{-1} \Sigma_{10}$.
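The conditional mean and variance formulas can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the source's implementation: the exponential covariance $\sigma^{2} \exp (-\phi d)$, the helper names `exp_cov` and `krige`, and the four locations and data values are all hypothetical choices made for the example.

```python
import numpy as np

def exp_cov(a, b, sigma2=1.0, phi=1.0):
    """Assumed exponential covariance sigma2 * exp(-phi * distance)."""
    dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-phi * dist)

def krige(s_obs, y_obs, s_new, sigma2=1.0, phi=1.0):
    """Zero-mean Kriging: weights, conditional mean and variance at s_new."""
    sigma_11 = exp_cov(s_obs, s_obs, sigma2, phi)           # Var(Y), n x n
    sigma_01 = exp_cov(s_new[None, :], s_obs, sigma2, phi)  # Cov(Y(s0), Y), 1 x n
    # Sigma_11 is symmetric, so solving Sigma_11 w = Sigma_10 gives
    # the transpose of Sigma_01 Sigma_11^{-1}.
    weights = np.linalg.solve(sigma_11, sigma_01.T).ravel()
    mean = float(weights @ y_obs)
    var = float(sigma2 - sigma_01.ravel() @ weights)        # sigma_00 = sigma2
    return weights, mean, var

# Hypothetical observed locations, data values and prediction location.
s_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y_obs = np.array([1.2, 0.7, -0.3, 2.1])
s_new = np.array([0.2, 0.1])

w1, m1, v1 = krige(s_obs, y_obs, s_new, sigma2=1.0)
w4, m4, v4 = krige(s_obs, y_obs, s_new, sigma2=4.0)

print("weights (sigma2 = 1):", np.round(w1, 4))
print("weights (sigma2 = 4):", np.round(w4, 4))
print("conditional variances:", round(v1, 4), round(v4, 4))
```

Changing `sigma2` leaves the weights, and hence the predictor, unchanged while scaling the conditional variance by the same factor, which matches the remark about the common spatial variance canceling in $\Sigma_{01} \Sigma_{11}^{-1}$.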

It is interesting to note that Kriging is an exact predictor in the sense that $E\left(Y\left(\mathbf{s}_{i}\right) \mid \mathbf{y}\right)=y\left(\mathbf{s}_{i}\right)$ for any $i=1, \ldots, n$. It is intuitively clear why this result holds: a random variable is an exact predictor of itself. Mathematically, this can be easily proved using the definition of the inverse of a matrix. To elaborate further, let $\Sigma$ now denote the $n \times n$ covariance matrix of $\mathbf{Y}$ (written $\Sigma_{11}$ above) and suppose that
$$\Sigma=\left(\begin{array}{c} \Sigma_{1}^{\prime} \\ \Sigma_{2}^{\prime} \\ \vdots \\ \Sigma_{n}^{\prime} \end{array}\right)$$
where $\Sigma_{i}^{\prime}$ is a row vector of dimension $n$. Then the result $\Sigma \Sigma^{-1}=I_{n}$, where $I_{n}$ is the identity matrix of order $n$, implies that $\Sigma_{i}^{\prime} \Sigma^{-1}=\mathbf{a}_{i}^{\prime}$, where the $i$th element of $\mathbf{a}_{i}$ is 1 and all others are zero. When the prediction location $\mathbf{s}_{0}$ coincides with $\mathbf{s}_{i}$, the covariance vector $\Sigma_{01}$ equals $\Sigma_{i}^{\prime}$, so the predictor is $\mathbf{a}_{i}^{\prime} \mathbf{y}=y\left(\mathbf{s}_{i}\right)$.
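The exactness property is easy to confirm numerically. The sketch below is an illustration only: it assumes an exponential covariance with unit variance and no measurement error (nugget) term, and the locations and data values are hypothetical.

```python
import numpy as np

# Hypothetical observed locations and data values.
s_obs = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y_obs = np.array([1.2, 0.7, -0.3, 2.1])

# n x n covariance matrix of Y under an assumed exponential covariance.
dist = np.linalg.norm(s_obs[:, None, :] - s_obs[None, :, :], axis=-1)
sigma = np.exp(-dist)

# Row i of sigma plays the role of Sigma_01 when s_0 = s_i, so stacking
# all rows gives Sigma Sigma^{-1}; by symmetry this is (Sigma^{-1} Sigma)'.
weights_at_obs = np.linalg.solve(sigma, sigma).T   # row i holds a_i'
pred_at_obs = weights_at_obs @ y_obs

# Conditional variance sigma_ii - Sigma_i' Sigma^{-1} Sigma_i at each s_i.
cond_var = np.diag(sigma) - np.einsum("ij,ij->i", weights_at_obs, sigma)

print("weights at observed sites:\n", np.round(weights_at_obs, 6))
print("predictions:", np.round(pred_at_obs, 6))
print("conditional variances:", np.round(cond_var, 6))
```

The weight matrix is the identity, the predictions reproduce the data, and the conditional variances vanish. Adding a nugget term to the data covariance would break this exact interpolation, which is why the no-nugget assumption is stated explicitly.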

The above discussion, with the simplifying assumption of a zero-mean GP, is justified since in practical applications we often assume a zero-mean GP only as a prior distribution. The mean surface of the data (or their transformations) is often explicitly modeled by a regression model, and hence such models contribute to determining the mean values of the predictions. In this context we note that in a non-Bayesian geostatistical modeling setup there are various flavors of Kriging, such as simple Kriging, ordinary Kriging, universal Kriging, co-Kriging and intrinsic Kriging, depending on the particular assumption made about the mean function. In our Bayesian inference setup such flavors of Kriging ensue automatically, since Bayesian inference methods are automatically conditioned on the observed data and the explicit model assumptions.

## Autocorrelation and partial autocorrelation

A study of time series for temporally correlated data will not be complete without the knowledge of autocorrelation. Simply put, autocorrelation means correlation with itself at different time intervals. The time interval is technically called the lag in the time series literature. For example, suppose $Y_{t}$ is a time series random variable where $t \geq 1$ is an integer. The autocorrelation at lag $k(\geq 1)$ is defined as $\rho_{k}=\operatorname{Cor}\left(Y_{t+k}, Y_{t}\right)$. It is obvious that the autocorrelation at lag $k=0, \rho_{0}$, is one. Ordinarily, $\rho_{k}$ decreases as $k$ increases just as the spatial correlation decreases when the distance between two locations increases. Viewed as a function of the lag $k, \rho_{k}$ is called the autocorrelation function, often abbreviated as ACF.
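As a small illustration, the sample version of $\rho_{k}$ can be computed directly from its definition. The sketch below is not from the source: the helper `sample_acf`, the simulated AR(1) series and its coefficient `phi = 0.7` are assumptions chosen for the example. For a stationary AR(1) process the theoretical autocorrelation is $\rho_{k}=\phi^{k}$, so the estimates should decay roughly geometrically.

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0, ..., rho_{max_lag} of a series."""
    y = np.asarray(y, dtype=float)
    centred = y - y.mean()
    denom = np.sum(centred ** 2)
    return np.array([np.sum(centred[k:] * centred[:len(y) - k]) / denom
                     for k in range(max_lag + 1)])

# Simulate a hypothetical AR(1) series Y_t = phi * Y_{t-1} + e_t.
rng = np.random.default_rng(0)
phi = 0.7
n = 5000
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

acf = sample_acf(y, max_lag=5)
for k, rho in enumerate(acf):
    print(f"lag {k}: sample rho = {rho:.3f}, AR(1) theory = {phi ** k:.3f}")
```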

Sometimes high autocorrelation at a lag $k>1$ persists because of high correlation between $Y_{t+k}$ and the intermediate time series, $Y_{t+k-1}, \ldots, Y_{t+1}$. The partial autocorrelation at lag $k$ measures the correlation between $Y_{t+k}$ and $Y_{t}$ after removing the autocorrelation at shorter lags. Formally, partial autocorrelation is defined as the conditional autocorrelation between $Y_{t+k}$ and $Y_{t}$ given the values of $Y_{t+k-1}, \ldots, Y_{t+1}$. The partial autocorrelation can also be easily explained with the help of multiple regression. To remove the effects of the intermediate time series $Y_{t+k-1}, \ldots, Y_{t+1}$, one considers two regression models: one of $Y_{t+k}$ on $Y_{t+k-1}, \ldots, Y_{t+1}$ and the other of $Y_{t}$ on $Y_{t+k-1}, \ldots, Y_{t+1}$. The simple correlation coefficient between the two sets of residuals after fitting the two regression models is the partial autocorrelation at the given lag $k$. To learn more, the interested reader is referred to the many excellent introductory textbooks on time series, such as the one by Chatfield (2003).
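The two-regression description translates directly into code. The following is a minimal sketch under stated assumptions: the helper `pacf_by_regression` is hypothetical, ordinary least squares with an intercept is used for both regressions, and the test series is a simulated AR(1) process with coefficient 0.7. For an AR(1) process the partial autocorrelation equals the autocorrelation at lag 1 and is zero at all higher lags.

```python
import numpy as np

def pacf_by_regression(y, k):
    """Partial autocorrelation at lag k >= 1 via two least-squares fits."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    if k == 1:
        # No intermediate values to remove: plain lag-1 correlation.
        return float(np.corrcoef(y[1:], y[:-1])[0, 1])
    m = n - k
    y_t = y[:m]     # Y_t
    y_tk = y[k:]    # Y_{t+k}
    # Intercept plus the intermediate values Y_{t+1}, ..., Y_{t+k-1}.
    X = np.column_stack([np.ones(m)] + [y[j:j + m] for j in range(1, k)])
    res_tk = y_tk - X @ np.linalg.lstsq(X, y_tk, rcond=None)[0]
    res_t = y_t - X @ np.linalg.lstsq(X, y_t, rcond=None)[0]
    return float(np.corrcoef(res_tk, res_t)[0, 1])

# Simulate a hypothetical AR(1) series Y_t = 0.7 * Y_{t-1} + e_t.
rng = np.random.default_rng(1)
n = 5000
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.7 * y[t - 1] + rng.normal()

pacf = {k: pacf_by_regression(y, k) for k in (1, 2, 3)}
for k, value in pacf.items():
    print(f"lag {k}: partial autocorrelation = {value:.3f}")
```

The sharp cut-off after lag 1, in contrast to the geometric decay of the autocorrelation function, is what makes the partial autocorrelation useful for judging the order of an autoregressive model.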
