## The Autocorrelation Function

The trouble with covariances is that they generally depend on the units in which $Y_t$ is measured. We can easily get around this problem by working with correlations, which are just scaled versions of covariances. We have:
Definition 31 The correlation between $Y_t$ and $Y_{t-k}$ is:
$$\operatorname{Corr}\left[Y_t, Y_{t-k}\right]=\frac{\operatorname{Cov}\left[Y_t, Y_{t-k}\right]}{\operatorname{Var}\left[Y_t\right]^{\frac{1}{2}} \operatorname{Var}\left[Y_{t-k}\right]^{\frac{1}{2}}} .$$
Using stationarity we can simplify this considerably. Since
$$\operatorname{Cov}\left[Y_t, Y_{t-k}\right]=\gamma(k)$$
and by stationarity
$$\operatorname{Var}\left[Y_t\right]^{\frac{1}{2}}=\operatorname{Var}\left[Y_{t-k}\right]^{\frac{1}{2}}=\gamma(0)^{\frac{1}{2}}$$
we have:
$$\operatorname{Corr}\left[Y_t, Y_{t-k}\right]=\frac{\gamma(k)}{\gamma(0)}$$
With this in mind we can define the autocorrelation function
$$\rho(k)=\operatorname{Corr}\left[Y_t, Y_{t-k}\right]$$
as follows:
Definition 32 Autocorrelation Function: The autocorrelation function $\rho(k)$ is defined as:
$$\rho(k)=\frac{\gamma(k)}{\gamma(0)}$$

## The Autocorrelation Function of an AR(1) Process

We have:
Theorem 39 The autocorrelation function of a stationary AR(1) process is:
$$\rho(k)=\phi^{|k|} .$$
Proof. From Theorem 30 it follows that
$$\begin{aligned} \rho(k) & =\frac{\gamma(k)}{\gamma(0)} \\ & =\frac{\phi^{|k|} \frac{\sigma^2}{1-\phi^2}}{\frac{\sigma^2}{1-\phi^2}} \\ & =\phi^{|k|} . \end{aligned}$$

We plot $\rho(k)$ for an $\operatorname{AR}(1)$ with $\phi=0.7$ below:
$$\rho(k) \text { when } \phi=0.7$$
Since $|\phi|<1$ it follows that the autocorrelation function, like the autocovariance function, has the short-memory property so that $\rho(k)=O\left(\tau^k\right)$ as given in Section 2.2 with $A=1$ and $\tau=|\phi|$.
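The result $\rho(k)=\phi^{|k|}$ can be checked numerically. The following is a minimal sketch, not part of the original notes: it simulates an AR(1) with normal innovations and compares a sample autocorrelation with the theoretical value. The function names, the burn-in length, the sample size, and the seed are all illustrative assumptions.

```python
import numpy as np

def simulate_ar1(phi, sigma, n, burn_in=500, seed=0):
    """Simulate Y_t = phi * Y_{t-1} + a_t with a_t ~ N(0, sigma^2)."""
    rng = np.random.default_rng(seed)
    a = rng.normal(0.0, sigma, size=n + burn_in)
    y = np.zeros(n + burn_in)
    for t in range(1, n + burn_in):
        y[t] = phi * y[t - 1] + a[t]
    return y[burn_in:]  # drop the burn-in so the start value is forgotten

def sample_acf(y, max_lag):
    """Sample version of rho(k) = gamma(k) / gamma(0) for k = 0..max_lag."""
    y = y - y.mean()
    gamma0 = np.dot(y, y) / len(y)
    return np.array([np.dot(y[k:], y[:len(y) - k]) / len(y) / gamma0
                     for k in range(max_lag + 1)])

phi = 0.7
lags = np.arange(6)
theoretical = phi ** np.abs(lags)  # rho(k) = phi^|k|
estimated = sample_acf(simulate_ar1(phi, sigma=1.0, n=20000), max_lag=5)
for k, r_true, r_hat in zip(lags, theoretical, estimated):
    print(f"k={k}  rho(k)={r_true:.3f}  sample={r_hat:.3f}")
```

With a long simulated series the sample values should sit close to the geometric decay $0.7^k$, although they will never match it exactly.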

## The Autocovariance Function

An important implication of stationarity is that the covariance between the business cycle in, say, the first and third quarters of 1999 is the same as the covariance between the business cycle in the first and third quarters of 1963. In general, covariances depend only on the number of periods separating $Y_t$ and $Y_s$, so that:

Theorem 20 If $Y_t$ is stationary then $\operatorname{Cov}\left[Y_{t_1}, Y_{t_2}\right]$ depends only on $k=t_1-t_2$; that is the number of periods separating $t_1$ and $t_2$.

Since we will often be focusing on covariances, and since $\operatorname{Cov}\left[Y_t, Y_{t-k}\right]$ depends only on $k$, let us define it as a function of $k$, written $\gamma(k)$, which we will refer to as the autocovariance function:

Definition 21 Autocovariance Function: Let $Y_t$ be a stationary time series with $E\left[Y_t\right]=0$. The autocovariance function for $Y_t$, denoted as $\gamma(k)$, is defined for $k=0, \pm 1, \pm 2, \pm 3, \ldots \pm \infty$ as:
$$\gamma(k) \equiv E\left[Y_t Y_{t-k}\right]=\operatorname{Cov}\left[Y_t, Y_{t-k}\right] .$$
We have the following results for the autocovariance function:
Theorem 22 $\gamma(0)=\operatorname{Var}\left[Y_t\right]>0$.
Theorem 23 $\gamma(k)=E\left[Y_t Y_{t-k}\right]=E\left[Y_s Y_{s-k}\right]$ for any $t$ and $s$.
Theorem 24 $\gamma(-k)=\gamma(k)$ (that is, $\gamma(k)$ is an even function).

## The AR(1) Model

For the AR(1) model we have already shown that:
$$\gamma(0)=\operatorname{Var}\left[Y_t\right]=\frac{\sigma^2}{1-\phi^2}$$
and that for $k>0$ :
$$\begin{aligned} \gamma(k) & =\phi^k \gamma(0) \\ & =\phi^k \frac{\sigma^2}{1-\phi^2} . \end{aligned}$$
We can make this formula correct for all $k$ by appealing to Theorem 24 and replacing $k$ with $|k|$ to obtain:

Theorem 30 For an $\operatorname{AR}(1)$ process the autocovariance function is given by:
$$\gamma(k)=\frac{\phi^{|k|} \sigma^2}{1-\phi^2} .$$
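As a small illustration, the formula in Theorem 30 can be written as a function and evaluated at positive and negative lags. This is only a sketch: the function name and the parameter values are assumptions chosen for the example.

```python
def ar1_autocovariance(k, phi, sigma2):
    """gamma(k) = phi**|k| * sigma2 / (1 - phi**2) for a stationary AR(1)."""
    if abs(phi) >= 1:
        raise ValueError("stationarity requires |phi| < 1")
    return phi ** abs(k) * sigma2 / (1.0 - phi ** 2)

phi, sigma2 = 0.7, 1.0  # illustrative values
for k in (-2, -1, 0, 1, 2):
    print(k, round(ar1_autocovariance(k, phi, sigma2), 4))
```

Using `abs(k)` is what makes the function even in $k$, in line with Theorem 24, and the value at $k=0$ is the variance from Theorem 22.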

## The Short-Memory Property

Many of the models that we will be considering will have the property that they quickly forget or quickly become independent of what occurs either in the distant future or the distant past. This forgetting occurs at an exponential rate which represents a very rapid type of decay.

For example, if you have a pie in the fridge and each day you eat one-half of what remains, you will quickly have almost no pie. After only ten days you would have:
$$\left(\frac{1}{2}\right)^{10}=\frac{1}{1024}$$
or about one-thousandth of a pie; maybe a couple of crumbs.
We will see that for stationary $\operatorname{ARMA}(\mathrm{p}, \mathrm{q})$ processes, the infinite moving average weights $\psi_k$, the autocorrelation function $\rho(k)$, and the forecast function $E_t\left[Y_{t+k}\right]$, all functions of the number of periods $k$, have the short-memory property, which we now define:

Definition 12 Short-Memory: Let $P_k$ for $k=0,1,2, \ldots \infty$ be some numerical property of a stationary time series which depends on $k$, the number of periods. We say $P_k$ displays a short-memory or $P_k=O\left(\tau^k\right)$ if
$$\left|P_k\right| \leq A \tau^k$$
where $A \geq 0$ and $0<\tau<1$.

If $P_k=O\left(\tau^k\right)$, that is, if $P_k$ has a short memory, then $P_k$ decays rapidly, in a manner that is at least as fast as $\tau^k$ decays to zero as $k \rightarrow \infty$. For example, if:
$$P_k=10 \cos (2 k)\left(-\frac{1}{2}\right)^k$$
then $P_k$ decays rapidly in a manner which is bounded by exponential decay since $|\cos (2 k)| \leq 1$ and so we have:
$$\left|P_k\right| \leq 10\left(\frac{1}{2}\right)^k=A \tau^k$$
where $\tau=\frac{1}{2}$ and $A=10$. This is illustrated in the plot below:
$$P_k=10 \cos (2 k)\left(-\frac{1}{2}\right)^k$$
Not everything decays so rapidly. For example if we reverse the $\frac{1}{2}$ and the $k$ in $\left(\frac{1}{2}\right)^k$ we obtain:
$$Q_k=\frac{1}{k^{\frac{1}{2}}}=k^{-\frac{1}{2}} .$$
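The contrast between the two examples can be checked numerically. The sketch below (function names are illustrative assumptions) evaluates $P_k$, its exponential envelope $A\tau^k$ with $A=10$ and $\tau=\frac{1}{2}$, and the polynomially decaying $Q_k$.

```python
import math

def p(k):
    """P_k = 10 cos(2k) (-1/2)^k from the example."""
    return 10.0 * math.cos(2 * k) * (-0.5) ** k

def bound(k, A=10.0, tau=0.5):
    """Exponential envelope A * tau^k."""
    return A * tau ** k

def q(k):
    """Q_k = k^(-1/2), which decays only polynomially."""
    return k ** -0.5

for k in (1, 5, 10, 20):
    print(f"k={k:2d}  |P_k|={abs(p(k)):.3e}  A*tau^k={bound(k):.3e}  Q_k={q(k):.3e}")
```

For moderately large $k$ the envelope $A\tau^k$ is already far below $Q_k$, which is the sense in which $Q_k$ does not decay "so rapidly".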

## The AR(1) Model

The simplest interesting model for $Y_t$ is a first-order autoregressive process or $\mathrm{AR}(1)$ which can be written as:
$$Y_t=\phi Y_{t-1}+a_t, a_t \sim \text { i.i.n }\left(0, \sigma^2\right),$$
where i.i.n. $\left(0, \sigma^2\right)$ means that $a_t$ is independently and identically distributed (i.i.d.) with a normal distribution with mean 0 and variance $\sigma^2$ so that the density of $a_t$ is:
$$p\left(a_t\right)=\frac{1}{\sqrt{2 \pi \sigma^2}} e^{-\frac{a_t^2}{2 \sigma^2}} .$$
We can attempt to calculate $E\left[Y_t\right]$ by taking expectations of both sides of (2.6) to obtain:
$$\begin{aligned} E\left[Y_t\right] & =\phi E\left[Y_{t-1}\right]+E\left[a_t\right] \\ & =\phi E\left[Y_{t-1}\right] \end{aligned}$$
since $E\left[a_t\right]=0$. We now need to find $E\left[Y_{t-1}\right]$. We could try the same approach with $E\left[Y_{t-1}\right]$ since $Y_{t-1}=\phi Y_{t-2}+a_{t-1}$ from which we would conclude that: $E\left[Y_{t-1}\right]=\phi E\left[Y_{t-2}\right]$ so that:
$$E\left[Y_t\right]=\phi^2 E\left[Y_{t-2}\right] ;$$
but now we need to find $E\left[Y_{t-2}\right]$. Clearly this process will never end.
If, however, we assume stationarity then it is possible to break this infinite regress since by the definition of stationarity in Definition 7:
$$E\left[Y_t\right]=E\left[Y_{t-1}\right] .$$
It then follows from (2.8) that:
$$E\left[Y_t\right]=\phi E\left[Y_t\right]$$
or
$$(1-\phi) E\left[Y_t\right]=0 .$$
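The last equation implies that, whenever $\phi \neq 1$, the only mean compatible with stationarity is $E\left[Y_t\right]=0$. A minimal sketch of the same idea iterates the recursion $E\left[Y_t\right]=\phi E\left[Y_{t-1}\right]$ from a hypothetical starting mean; the starting value and the number of steps are assumptions made for illustration only.

```python
def iterate_mean(phi, start_mean, steps):
    """Apply E[Y_t] = phi * E[Y_{t-1}] repeatedly from a hypothetical start value."""
    m = start_mean
    path = [m]
    for _ in range(steps):
        m = phi * m
        path.append(m)
    return path

# With |phi| < 1 any starting mean is driven toward zero.
path = iterate_mean(phi=0.7, start_mean=5.0, steps=50)
print(path[0], path[10], path[50])

# Zero is a fixed point of the recursion: starting at 0 stays at 0.
print(iterate_mean(phi=0.7, start_mean=0.0, steps=3))
```

A non-zero starting mean changes at every step, so it cannot be the constant mean of a stationary series, whereas zero is left unchanged.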

## Smoothing spline

The main disadvantage of the kernel smoothing method and the multitaper smoothing method is that they cannot guarantee that the final estimate is positive semidefinite while allowing flexible smoothing for each element of the spectral matrix. Thus, the same bandwidth is often applied to all the spectral components. However, in many applications, different components of the spectral matrix may need different degrees of smoothness, and so require different smoothing parameters to get optimal estimates. To overcome this difficulty, Dai and Guo (2004) proposed a Cholesky decomposition-based smoothing spline method for spectrum estimation. The method models each Cholesky component separately, using different smoothing parameters. It first obtains a positive-definite and asymptotically unbiased initial spectral estimator $\widetilde{\mathbf{f}}_M(\omega)$ through sine multitapers as shown in Eq. (9.65). Then, it further smooths the Cholesky components of the spectral matrix via the smoothing spline and penalized sum of squares, which allows different degrees of smoothness for different Cholesky elements.

Suppose the spectral matrix $\tilde{\mathbf{f}}(\omega)$ has the Cholesky decomposition $\tilde{\mathbf{f}}(\omega)=\boldsymbol{\Gamma} \boldsymbol{\Gamma}^*$, where $\boldsymbol{\Gamma}$ is an $m \times m$ lower triangular matrix. To obtain a unique decomposition, the diagonal elements of $\boldsymbol{\Gamma}$ are constrained to be positive. The diagonal elements $\gamma_{j, j}, j=1, \ldots, m$, the real parts $\Re\left(\gamma_{j, k}\right)$, and the imaginary parts $\Im\left(\gamma_{j, k}\right)$, $j>k$, are smoothed by splines with different smoothing parameters. For $\gamma \in\left\{\gamma_{j, j}, \Re\left(\gamma_{j, k}\right), \Im\left(\gamma_{j, k}\right): j>k,\ j, k=1, \ldots, m\right\}$, we have
$$\gamma\left(\omega_{\ell}\right)=a\left(\omega_{\ell}\right)+e\left(\omega_{\ell}\right),$$
where $a\left(\omega_{\ell}\right)=\mathrm{E}\left\{\gamma\left(\omega_{\ell}\right)\right\}$, and the $e\left(\omega_{\ell}\right), \ell=1, \ldots, n$, are independent errors with zero means and variances depending on the frequency point $\omega_{\ell}$. The function $a(\cdot)$ is periodic and is fitted by a periodic smoothing spline (Wahba, 1990) of the form
$$a(\omega)=c_0+\sum_{\nu=1}^{n / 2-1} c_\nu \sqrt{2} \cos (2 \pi \nu \omega)+\sum_{\nu=1}^{n / 2-1} d_\nu \sqrt{2} \sin (2 \pi \nu \omega)+c_{n / 2} \cos (\pi n \omega)$$
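To make the trigonometric form of $a(\omega)$ concrete, the sketch below builds the corresponding basis matrix and evaluates $a(\omega)$ for a given coefficient vector. It is only an illustration of the expansion, not of the penalized fitting step. It assumes an even $n$ and the frequency grid $\omega_\ell=\ell / n$; the function name and the coefficient ordering are also assumptions.

```python
import numpy as np

def periodic_basis(omega, n):
    """Columns: 1, sqrt(2)cos(2*pi*nu*omega), sqrt(2)sin(2*pi*nu*omega) for
    nu = 1..n/2-1, and cos(pi*n*omega). Assumes n is even."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even n")
    omega = np.asarray(omega, dtype=float)
    nu = np.arange(1, n // 2)
    arg = 2.0 * np.pi * np.outer(omega, nu)
    return np.column_stack([
        np.ones_like(omega),
        np.sqrt(2.0) * np.cos(arg),
        np.sqrt(2.0) * np.sin(arg),
        np.cos(np.pi * n * omega),
    ])

n = 8
omega = np.arange(1, n + 1) / n   # assumed grid omega_l = l / n
X = periodic_basis(omega, n)
coef = np.zeros(n)                # ordering: c_0, c_1..c_{n/2-1}, d_1..d_{n/2-1}, c_{n/2}
coef[0], coef[1] = 1.0, 0.5       # illustrative coefficients
a = X @ coef                      # a(omega) evaluated on the grid
print(X.shape, np.round(a, 3))
```

On this grid the $n$ basis columns are mutually orthogonal, which is one reason the $\sqrt{2}$ scaling is convenient.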

## Penalized Whittle likelihood

Penalized methods have been one of the most popular research topics for the past decade. In the area of spectrum estimation, Pawitan and O'Sullivan (1994) developed a penalized likelihood method for the nonparametric estimation of the power spectrum of a univariate time series. Pawitan (1996) developed a penalized Whittle likelihood estimator for a bivariate time series. However, their approach cannot be easily generalized to higher dimensions. More recently, Krafty and Collinge (2013) proposed a penalized Whittle likelihood method to estimate the power spectrum of a vector-valued time series. The method allows for varying levels of smoothness among spectral components while accounting for the positive definiteness of spectral matrices and the Hermitian and periodic structures of power spectra as functions of frequency. In this section, we briefly discuss the method by Krafty and Collinge (2013).
Recall that $\widetilde{\mathbf{Y}}_{\ell}=\widetilde{\mathbf{Y}}\left(\omega_{\ell}\right)$ are the discrete Fourier transforms of the time series, and define the negative log Whittle likelihood as
$$L(\mathbf{f})=\sum_{\ell=1}^L\left\{\log \left|\mathbf{f}\left(\omega_{\ell}\right)\right|+\widetilde{\mathbf{Y}}_{\ell}^*\left[\mathbf{f}\left(\omega_{\ell}\right)\right]^{-1} \widetilde{\mathbf{Y}}_{\ell}\right\},$$
where $\omega_{\ell}=\ell / n$ and $L=[(n-1) / 2]$. Based on $L$, the method penalizes the roughness of the estimated spectrum. The spectral matrix $\mathbf{f}(\omega)$ is modeled by the Cholesky decomposition such that $\mathbf{f}(\omega)=\left[\boldsymbol{\Gamma} \boldsymbol{\Gamma}^*\right]^{-1}$, where $\boldsymbol{\Gamma}$ is an $m \times m$ lower triangular matrix with real-valued diagonal elements. Further, denote by $\boldsymbol{\Gamma}_{i, j, R}(\omega ; \mathbf{f})$ and $\boldsymbol{\Gamma}_{i, j, I}(\omega ; \mathbf{f})$ the real and imaginary parts of the $(i, j)$ element of $\boldsymbol{\Gamma}$. The method proposes a measure of roughness of a power spectrum through the integrated squared $k$th derivatives of the $m(m+1) / 2$ real and $m(m-1) / 2$ imaginary components of the Cholesky decomposition. Let the smoothing parameters $\lambda=\left\{\rho_{i, j}, \theta_{i, j}: i \leq j=1, \ldots, m\right\}$, which control the roughness penalties on $\boldsymbol{\Gamma}_{i, j, R}$ and $\boldsymbol{\Gamma}_{i, j, I}$, satisfy $\rho_{i, j}>0$ and $\theta_{i, j}>0$. The roughness measure for a spectrum $\mathbf{f}(\omega)$ is
$$J_\lambda(\mathbf{f})=\sum_{i \leq j=1}^m \rho_{i, j} \int_0^{1 / 2}\left\{\boldsymbol{\Gamma}_{i, j, R}^{(k)}(\omega ; \mathbf{f})\right\}^2 d \omega+\sum_{i<j=1}^m \theta_{i, j} \int_0^{1 / 2}\left\{\boldsymbol{\Gamma}_{i, j, I}^{(k)}(\omega ; \mathbf{f})\right\}^2 d \omega .$$
The interpretation is that the penalty function $J$ shrinks the estimates of power spectra toward real-valued matrix functions that are constant across frequency. Consequently, we consider minimizing the penalized Whittle negative log-likelihood
$$Q_\lambda(\mathbf{f})=L(\mathbf{f})+J_\lambda(\mathbf{f}) .$$
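The likelihood term $L(\mathbf{f})$ is simple to evaluate once the spectral matrices and the Fourier-transformed data are available on the same frequency grid. The following is a minimal sketch of that term only; it does not implement the roughness penalty $J_\lambda$ or the minimization. The function name, the array layout, and the random test data are assumptions, and the normalization of the discrete Fourier transform is taken as given.

```python
import numpy as np

def neg_log_whittle(f, Y):
    """L(f) = sum_l { log|f(w_l)| + Y_l^* f(w_l)^{-1} Y_l }.

    f : (L, m, m) complex array of Hermitian positive-definite matrices
    Y : (L, m) complex array of DFT vectors at the same frequencies
    """
    total = 0.0
    for f_l, y_l in zip(f, Y):
        _, logdet = np.linalg.slogdet(f_l)              # log of |det f(w_l)|
        quad = np.vdot(y_l, np.linalg.solve(f_l, y_l))  # Y^* f^{-1} Y
        total += logdet + quad.real
    return total

# Illustrative check with made-up data: for f = I the log-determinant
# vanishes and L(f) reduces to the sum of squared moduli of Y.
rng = np.random.default_rng(0)
n_freq, m = 5, 2
Y = rng.normal(size=(n_freq, m)) + 1j * rng.normal(size=(n_freq, m))
f_identity = np.tile(np.eye(m, dtype=complex), (n_freq, 1, 1))
value = neg_log_whittle(f_identity, Y)
print(value, np.sum(np.abs(Y) ** 2))
```

Solving the linear system instead of forming the inverse explicitly is a common numerical choice here, not something prescribed by the method.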

## Time-varying autoregressive model

One of the methods used in Dahlhaus (2000) is the time-varying VARMA $(p, q)$ model. For a vector autoregressive model $\operatorname{VAR}(p)$, it is defined by
$$\mathbf{Z}_t=\boldsymbol{\mu}(t / n)+\sum_{j=1}^p \boldsymbol{\Phi}_j(t / n)\left[\mathbf{Z}_{t-j}-\boldsymbol{\mu}((t-j) / n)\right]+\boldsymbol{\Sigma}(t / n) \boldsymbol{\varepsilon}_t, \quad t=1, \ldots, n,$$
where the $\boldsymbol{\varepsilon}_t$ are $m$-dimensional independent random vectors with mean zero and covariance matrix $\mathbf{I}$. In addition, we assume some smoothness conditions on $\boldsymbol{\Sigma}(\cdot)$ and $\boldsymbol{\Phi}_j(\cdot)$. In some neighborhood of a fixed time point $u_0=t_0 / n$, the process $\mathbf{Z}_t$ can be approximated by the stationary process $\mathbf{Z}_t\left(u_0\right)$ given by
$$\mathbf{Z}_t\left(u_0\right)=\boldsymbol{\mu}\left(u_0\right)+\sum_{j=1}^p \boldsymbol{\Phi}_j\left(u_0\right)\left[\mathbf{Z}_{t-j}\left(u_0\right)-\boldsymbol{\mu}\left(u_0\right)\right]+\boldsymbol{\Sigma}\left(u_0\right) \boldsymbol{\varepsilon}_t, \quad t=1, \ldots, n .$$
$\mathbf{Z}_t$ has a unique time-varying power spectrum, which is locally the same as the power spectrum of $\mathbf{Z}_t(u)$, that is,
$$\mathbf{f}(u, \omega)=\left[\boldsymbol{\Phi}\left(u, e^{-i \omega}\right)\right]^{-1} \boldsymbol{\Sigma}(u)\left\{\left[\boldsymbol{\Phi}\left(u, e^{-i \omega}\right)\right]^{-1}\right\}^*,$$

where $u=t / n$, and
$$\boldsymbol{\Phi}(u, B)=\mathbf{I}-\boldsymbol{\Phi}_1(u) B-\cdots-\boldsymbol{\Phi}_p(u) B^p .$$
Similarly, the local covariance matrix is
$$\boldsymbol{\Gamma}(u, j)=\int_{-\pi}^\pi \mathbf{f}(u, \omega) \exp (j i \omega) d \omega .$$
Based on this statement, the time-varying spectrum can be obtained by estimating the time-varying parameters of the VAR model. For more properties of the estimation based on time-varying VARMA$(p, q)$, $\operatorname{VAR}(p)$, and $\operatorname{VMA}(q)$ models, we refer readers to Dahlhaus (2000).
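Once the coefficient matrices at a rescaled time $u$ are known or estimated, the spectrum formula above can be evaluated directly. The sketch below follows that formula as written, reading $\boldsymbol{\Phi}\left(u, e^{-i \omega}\right)$ as $\mathbf{I}-\sum_j \boldsymbol{\Phi}_j(u) e^{-i \omega j}$, which is an assumption about the notation. The function names and the coefficient function `phi1` are hypothetical and chosen purely for illustration.

```python
import numpy as np

def tvvar_spectrum(phis, sigma, omega):
    """f(u, w) = Phi^{-1} Sigma (Phi^{-1})^* with Phi = I - sum_j Phi_j e^{-i w j}.

    phis  : list of (m, m) arrays, the coefficients Phi_1(u), ..., Phi_p(u) at a fixed u
    sigma : (m, m) array, the matrix Sigma(u) in the spectrum formula
    omega : frequency in radians
    """
    m = sigma.shape[0]
    Phi = np.eye(m, dtype=complex)
    for j, phi_j in enumerate(phis, start=1):
        Phi -= phi_j * np.exp(-1j * omega * j)
    Phi_inv = np.linalg.inv(Phi)
    return Phi_inv @ sigma @ Phi_inv.conj().T

def phi1(u):
    """Hypothetical smoothly varying VAR(1) coefficient, for illustration only."""
    return np.array([[0.5 * u, 0.1], [0.0, 0.3 + 0.2 * u]])

sigma = np.eye(2)
for u in (0.1, 0.9):
    f = tvvar_spectrum([phi1(u)], sigma, omega=0.5)
    print(u, np.round(f, 3))
```

Evaluating the function on a grid of $(u, \omega)$ values gives a picture of how the spectrum changes over time under the assumed coefficient path.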

## Smoothing spline ANOVA model

Based on the locally stationary process, the time-varying spectrum can also be estimated nonparametrically via the smoothing spline Analysis of Variance (ANOVA) model by Guo and Dai (2006). However, their definition of a locally stationary process is slightly different from that of Dahlhaus (2000). In Section 9.5.2, we mentioned that Dahlhaus (2000) assumes a series of transfer functions $\mathbf{A}^0(t / n, \omega)$ that converge to a large-sample transfer function $\mathbf{A}(u, \omega)$ in order to allow for the fitting of parametric models. Since Guo and Dai (2006) considered nonparametric estimation, they used $\mathbf{A}(u, \omega)$ directly.

Definition 9.2 Without loss of generality, the $m$-dimensional zero-mean time series of length $n$, $\left\{\mathbf{Z}_t: t=1, \ldots, n\right\}$, is called locally stationary if
$$\mathbf{Z}_t=\int_{-\pi}^\pi \mathbf{A}(t / n, \omega) \exp (i \omega t) d \mathbf{U}(\omega),$$
where we assume that the cumulants of $d \mathbf{U}(\omega)$ exist and are bounded for all orders. For the details, please see Brillinger (2002), and Guo and Dai (2006).

Based on this definition, the smoothing ANOVA model takes a two-stage estimation procedure. At the first stage, the locally stationary process is approximated by piecewise stationary time series with small blocks to obtain initial spectrum estimates and the Cholesky decomposition. The initial spectrum estimates are obtained by the multitaper method to reduce variance. At the second stage, each element of the Cholesky decomposition is treated as a bivariate smooth function of time and frequency and is modeled by the smoothing spline ANOVA model by Gu and Wahba (1993). The final estimated time-varying spectrum is reconstructed from the smoothed elements of the Cholesky decomposition. Thus, the method provides a way to ensure the final estimate of the multivariate spectrum is positive-definite while allowing enough flexibility in the smoothness of its elements.

We shall briefly discuss the smoothing spline ANOVA step. Suppose the spectral matrix $\widetilde{\mathbf{f}}(u, \omega)$ has the Cholesky decomposition $\widetilde{\mathbf{f}}(u, \omega)=\mathbf{L}(u, \omega) \mathbf{L}(u, \omega)^*$, where $\mathbf{L}(u, \omega)$ is an $m \times m$ lower triangular matrix with elements $\gamma_{j, k}(u, \omega)$. The method smooths the diagonal elements $\gamma_{j, j}(u, \omega), j=1, \ldots, m$, the real parts $\Re\left\{\gamma_{j, k}(u, \omega)\right\}$, and the imaginary parts $\Im\left\{\gamma_{j, k}(u, \omega)\right\}$, for $j>k$, separately with their own smoothing parameters. Let $\gamma(u, \omega) \in\left\{\gamma_{j, j}(u, \omega), \Re\left\{\gamma_{j, k}(u, \omega)\right\}, \Im\left\{\gamma_{j, k}(u, \omega)\right\}: j>k,\ j, k=1, \ldots, m\right\}$. We have
$$\gamma(u, \omega)=a(u, \omega)+\varepsilon(u, \omega),$$
where $a(u, \omega)$ is the corresponding Cholesky decomposition element of the spectrum, such that $a(u, \omega)=E\{\gamma(u, \omega)\}$, and the $\varepsilon(u, \omega)$ are independent errors with zero mean and variance depending on the time-frequency point $(u, \omega)$.
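The reason for smoothing Cholesky components rather than the spectral matrix itself is that any lower triangular factor, however its entries are modified, multiplies back into a Hermitian positive-semidefinite matrix. The sketch below illustrates this for a single made-up $2 \times 2$ spectral matrix; the helper names and the crude rescaling used in place of an actual smoothing step are assumptions.

```python
import numpy as np

def cholesky_components(f):
    """Split the Cholesky factor of a Hermitian positive-definite matrix into
    its diagonal, and the real and imaginary parts below the diagonal."""
    L = np.linalg.cholesky(f)            # lower triangular, f = L @ L^H
    rows, cols = np.tril_indices(f.shape[0], k=-1)
    return np.diag(L).real, L[rows, cols].real, L[rows, cols].imag

def rebuild(diag, re, im):
    """Reassemble f = L L^* from (possibly smoothed) Cholesky components."""
    m = len(diag)
    L = np.zeros((m, m), dtype=complex)
    L[np.diag_indices(m)] = diag
    rows, cols = np.tril_indices(m, k=-1)
    L[rows, cols] = re + 1j * im
    return L @ L.conj().T

# Made-up Hermitian positive-definite matrix standing in for a spectral estimate.
f = np.array([[2.0, 0.5 - 0.3j], [0.5 + 0.3j, 1.0]])
diag, re, im = cholesky_components(f)
f_back = rebuild(diag, re, im)

# Stand-in for smoothing: alter each component separately, then rebuild.
f_altered = rebuild(1.1 * diag, 0.5 * re, 0.0 * im)
print(np.round(f_back, 3))
print(np.linalg.eigvalsh(f_altered))
```

The rebuilt matrix from unaltered components recovers the input, and the altered version still has non-negative eigenvalues, which is the property the two-stage procedure relies on.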

