### The ‘Rigorous’ or ‘Plug-in’ Lasso


## The ‘regularisation event’

Bickel et al. (2009) present a theoretically derived penalisation method for the lasso. Belloni, Chernozhukov, Hansen, and coauthors in a series of papers (e.g., Belloni et al. (2011), Belloni and Chernozhukov (2013), Belloni et al. (2016), Chernozhukov et al. (2015) and, most recently, Chernozhukov et al. (2019)) proposed feasible algorithms for implementing the ‘rigorous’ or ‘plug-in’ lasso and extended it to accommodate heteroskedasticity, non-Gaussian disturbances, and clustered data. The estimator has two attractive features for our purposes. First, it is theoretically and intuitively appealing, with well-established properties. Second, it is computationally attractive compared to cross-validation. We first present the main results for the rigorous lasso for independent data, and then briefly summarise the ‘cluster-lasso’ of Belloni et al. (2016) before turning to the more general time-series setting analysed in Chernozhukov et al. (2019).

The rigorous lasso is consistent in terms of prediction and parameter estimation under three main conditions:

• Sparsity
• Restricted sparse eigenvalue condition
• The ‘regularisation event’.

We consider each of these in turn.

Exact sparsity we have already discussed: there is a large set of potentially relevant predictors, but the true model has only a small number of regressors. Exact sparsity is a strong assumption, and in fact it is stronger than the rigorous lasso requires.

Instead, we assume approximate sparsity. Intuitively, some true coefficients may be non-zero but small enough in absolute size that the lasso performs well even if the corresponding predictors are not selected.
Belloni et al. (2012) define the approximate sparse model (ASM),
$$y_{i}=f\left(\boldsymbol{w}_{i}\right)+\varepsilon_{i}=\boldsymbol{x}_{i}^{\prime} \boldsymbol{\beta}_{0}+r_{i}+\varepsilon_{i},$$
where $\varepsilon_{i}$ are independently distributed, but possibly heteroskedastic and non-Gaussian, errors. The elementary predictors $\boldsymbol{w}_{i}$ are linked to the dependent variable through the unknown and possibly non-linear function $f(\cdot)$. The objective is to approximate $f\left(\boldsymbol{w}_{i}\right)$ using the target parameter vector $\boldsymbol{\beta}_{0}$ and the transformations $\boldsymbol{x}_{i}:=P\left(\boldsymbol{w}_{i}\right)$, where $P(\cdot)$ is a set of transformations. The vector of predictors $\boldsymbol{x}_{i}$ may be large relative to the sample size. In particular, the setup accommodates the case where a large number of transformations (polynomials, dummies, etc.) approximate $f\left(\boldsymbol{w}_{i}\right)$.

Approximate sparsity requires that $f\left(\boldsymbol{w}_{i}\right)$ can be approximated sufficiently well using only a small number of non-zero coefficients. Specifically, the target vector $\boldsymbol{\beta}_{0}$ and the sparsity index $s$ need to satisfy $$\left\|\boldsymbol{\beta}_{0}\right\|_{0}:=s \ll n \quad \text { with } \frac{s^{2} \log ^{2}(p \vee n)}{n} \rightarrow 0$$
and the resulting approximation error $r_{i}=f\left(\boldsymbol{w}_{i}\right)-\boldsymbol{x}_{i}^{\prime} \boldsymbol{\beta}_{0}$ satisfies the bound
$$\sqrt{\frac{1}{n} \sum_{i=1}^{n} r_{i}^{2}} \leq C \sqrt{\frac{s}{n}}$$
where $C$ is a positive constant.
For example, consider the case where $f\left(\boldsymbol{w}_{i}\right)$ is linear with $f\left(\boldsymbol{w}_{i}\right)=\boldsymbol{x}_{i}^{\prime} \boldsymbol{\beta}^{\star}$, but the true parameter vector $\boldsymbol{\beta}^{\star}$ is high-dimensional: $\left\|\boldsymbol{\beta}^{\star}\right\|_{0}>n$. Approximate sparsity means we can still approximate $\boldsymbol{\beta}^{\star}$ using the sparse target vector $\boldsymbol{\beta}_{0}$ as long as $r_{i}=\boldsymbol{x}_{i}^{\prime}\left(\boldsymbol{\beta}^{\star}-\boldsymbol{\beta}_{0}\right)$ is sufficiently small, as specified in (13).
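The decaying-coefficient case can be illustrated numerically. The following sketch (my own illustration, not from the paper) builds a dense $\boldsymbol{\beta}^{\star}$ with geometrically decaying entries, forms the sparse target $\boldsymbol{\beta}_{0}$ from its $s$ largest coefficients, and checks that the root-mean-square approximation error is small relative to $\sqrt{s/n}$:

```python
# Illustrative check of approximate sparsity (not the authors' code):
# a dense beta* with rapidly decaying coefficients is well approximated
# by a sparse target beta_0 that keeps only its s largest entries.
import numpy as np

rng = np.random.default_rng(42)
n, p, s = 100, 500, 5
X = rng.standard_normal((n, p))

beta_star = 0.5 ** np.arange(p)                      # dense but decaying
beta_0 = np.where(np.arange(p) < s, beta_star, 0.0)  # sparse target

r = X @ (beta_star - beta_0)                         # approximation error r_i
rmse = np.sqrt(np.mean(r**2))
print(rmse, np.sqrt(s / n))  # rmse is far below the sqrt(s/n) benchmark
```

Here the tail coefficients contribute so little that the bound $\sqrt{n^{-1}\sum_i r_i^2} \leq C\sqrt{s/n}$ holds comfortably even with $C=1$.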

## Implementing the Rigorous Lasso

The quantile function $q_{\Lambda}(\cdot)$ for the maximal element of the score vector is unknown. The most common approach to addressing this is to use a theoretically-derived upper bound that guarantees that the regularisation event (14) holds asymptotically. ${ }^{9}$ Specifically, Belloni et al. (2012) show that

$$\mathrm{P}\left(\max _{1 \leq j \leq p} c\left|S_{j}\right| \leq \frac{\lambda \psi_{j}}{n}\right) \rightarrow 1 \quad \text { as } n \rightarrow \infty, \gamma \rightarrow 0$$
$$\lambda=2 c \sqrt{n} \Phi^{-1}(1-\gamma /(2 p)) \quad \psi_{j}=\sqrt{\frac{1}{n} \sum_{i} x_{i j}^{2} \varepsilon_{i}^{2}}$$
where $c$ is the slack parameter from above and $\gamma \rightarrow 0$ ensures that the probability of the regularisation event converges to 1. Common settings for $c$ and $\gamma$, based on Monte Carlo studies, are $c=1.1$ and $\gamma=0.1 / \log (n)$, respectively.
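The penalty level is straightforward to compute; a minimal helper (hypothetical, not taken from any package) with the recommended defaults might look like:

```python
# Hypothetical helper for the plug-in penalty level; the defaults c = 1.1 and
# gamma = 0.1/log(n) follow the Monte Carlo recommendations above.
import math
from statistics import NormalDist

def plugin_lambda(n, p, c=1.1, gamma=None):
    """lambda = 2*c*sqrt(n)*Phi^{-1}(1 - gamma/(2p))."""
    if gamma is None:
        gamma = 0.1 / math.log(n)
    return 2.0 * c * math.sqrt(n) * NormalDist().inv_cdf(1.0 - gamma / (2.0 * p))

print(plugin_lambda(n=100, p=50))  # roughly 77 for these settings
```

Note that the penalty grows only slowly in $p$ (through the normal quantile), which is what makes the rigorous lasso usable when $p$ is large relative to $n$.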

The only remaining element is estimation of the ideal penalty loadings $\psi_{j}$. Belloni et al. $(2012,2014)$ recommend an iterative procedure based on some initial set of residuals $\hat{\varepsilon}_{0, i}$. One choice is to use the $d$ predictors that have the highest correlation with $y_{i}$ and regress $y_{i}$ on these using OLS; $d=5$ is their suggestion. The residuals from this OLS regression can be used to obtain an initial set of penalty loadings $\hat{\psi}_{j}$ according to $(20)$. These initial penalty loadings and the penalty level from $(20)$ are used to obtain the lasso or post-lasso estimator $\hat{\boldsymbol{\beta}}$. This estimator is then used to obtain an updated set of residuals and penalty loadings according to $(20)$, and then an updated lasso estimator. The procedure can be iterated further if desired.
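The iteration can be sketched in a few lines. The condensed implementation below is my own illustration (not the authors' code): it uses scikit-learn's `Lasso` for the inner step, and folds the loadings into rescaled columns since `Lasso` accepts only a single scalar penalty.

```python
# Sketch of the iterated rigorous lasso with heteroskedasticity-robust
# penalty loadings; variable names and the rescaling trick are illustrative.
import numpy as np
from statistics import NormalDist
from sklearn.linear_model import Lasso

def rigorous_lasso(X, y, c=1.1, d=5, n_iter=2):
    n, p = X.shape
    gamma = 0.1 / np.log(n)
    lam = 2 * c * np.sqrt(n) * NormalDist().inv_cdf(1 - gamma / (2 * p))
    # Initial residuals: OLS of y on the d predictors most correlated with y.
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    top = np.argsort(corr)[-d:]
    coef, *_ = np.linalg.lstsq(X[:, top], y, rcond=None)
    eps = y - X[:, top] @ coef
    for _ in range(n_iter):
        psi = np.sqrt(np.mean(X**2 * eps[:, None]**2, axis=0))  # loadings
        # sklearn's Lasso solves (1/2n)||y - Xb||^2 + alpha*||b||_1; with the
        # loadings folded into rescaled columns, alpha = lambda/(2n) matches.
        fit = Lasso(alpha=lam / (2 * n), fit_intercept=False).fit(X / psi, y)
        beta = fit.coef_ / psi   # undo the column rescaling
        eps = y - X @ beta       # updated residuals for the next pass
    return beta
```

In practice one or two iterations usually suffice, since the loadings change little once the residuals are close to the true errors.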

The framework set out above requires only independence across observations; heteroskedasticity, a common issue facing empirical researchers, is automatically accommodated. For this reason we refer to it as the ‘heteroskedastic-consistent rigorous lasso’ or HC-lasso: heteroskedasticity is captured in the penalty loadings for the score vector. ${ }^{10}$ Intuitively, heteroskedasticity affects the probability that the term $\max _{j}\left|\sum_{i} x_{i j} \varepsilon_{i}\right|$ takes on extreme values, and this needs to be captured via the penalty loadings. In the special case of homoskedasticity, the ideal penalisation in $(20)$ simplifies:
$$\lambda=2 c \sigma \sqrt{n} \Phi^{-1}(1-\gamma /(2 p)), \quad \psi_{j}=1$$
This follows from the fact that we have standardised the predictors to have unit variance and hence homoskedasticity implies $E\left(x_{i j}^{2} \varepsilon_{i}^{2}\right)=\sigma^{2} E\left(x_{i j}^{2}\right)=\sigma^{2}$. The iterative procedure above is used to obtain residuals to form an estimate $\hat{\sigma}^{2}$ of the error variance $\sigma^{2}$. We refer to the rigorous lasso with this simplification as the ‘standard’ or ‘basic’ rigorous lasso.

## The Rigorous Lasso for Panel Data

The rigorous lasso has been extended to cover a special case of dependent data, namely panel data. The ‘cluster-lasso’ proposed by Belloni et al. (2016) allows for arbitrary within-panel correlation. The theoretical justification for the cluster-lasso also supports the use of the rigorous lasso in a pure time-series setting, and specifically the HAC-lasso and AC-lasso proposed in this paper. Belloni et al. (2016) prove consistency of the rigorous cluster-lasso for both the large-$n$, fixed-$T$ and large-$n$, large-$T$ settings. The large-$n$, fixed-$T$ results also apply to the specific forms of the HAC-lasso and AC-lasso proposed here. We first outline the Belloni et al. (2016) cluster-lasso and then our proposed estimators.

Belloni et al. (2016) present the approach in the context of a fixed-effects panel-data model with balanced panels, but the fixed effects and the balanced structure are not essential, and the approach applies to any setup with clustered data. For presentation purposes we simplify and write the model as a balanced panel:
$$y_{i t}=\boldsymbol{x}_{i t}^{\prime} \boldsymbol{\beta}+\varepsilon_{i t}, \quad i=1, \ldots, n, \quad t=1, \ldots, T$$
The general intuition behind the rigorous lasso is to control the noise in the score vector $\boldsymbol{S}=\left(S_{1}, \ldots, S_{j}, \ldots, S_{p}\right)$, where $S_{j}=\frac{2}{n} \sum_{i=1}^{n} x_{i j} \varepsilon_{i}$. Specifically, we choose the overall penalty $\lambda$ and the predictor-specific penalty loadings $\psi_{j}$ so that $\frac{\lambda}{n}$ exceeds the maximal element of the scaled score vector $\left|\psi_{j}^{-1} S_{j}\right|$ with high probability. In effect, the ideal penalty loading $\psi_{j}$ scales the $j$th element of the score by its standard deviation. In the benchmark heteroskedastic case, the ideal penalty loading is $\psi_{j}=\sqrt{\frac{1}{n} \sum_{i} x_{i j}^{2} \varepsilon_{i}^{2}}$; under homoskedasticity, the ideal penalty loading is simply $\psi_{j}=\sigma$ for all $j$ and hence can be absorbed into the overall penalty $\lambda$.
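This score-domination condition can be checked numerically. The simulation below (my own illustration, assuming standard normal predictors and errors with $\boldsymbol{\beta}_{0}=\mathbf{0}$) verifies that the plug-in penalty satisfies $c \max_j |\psi_j^{-1} S_j| \leq \lambda/n$ in roughly a fraction $1-\gamma$ of replications, or more:

```python
# Monte Carlo check (illustrative, not from the paper) that the plug-in
# lambda dominates the scaled score vector with high probability.
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
n, p, c, reps = 200, 100, 1.1, 200
gamma = 0.1 / np.log(n)
lam = 2 * c * np.sqrt(n) * NormalDist().inv_cdf(1 - gamma / (2 * p))

hits = 0
for _ in range(reps):
    X = rng.standard_normal((n, p))
    eps = rng.standard_normal(n)         # errors under beta_0 = 0
    S = (2 / n) * X.T @ eps              # score vector S_j
    psi = np.sqrt(np.mean(X**2 * eps[:, None]**2, axis=0))  # ideal loadings
    hits += c * np.max(np.abs(S) / psi) <= lam / n
print(hits / reps)                       # frequency of the regularisation event
```

By a union bound the failure probability is at most roughly $\gamma$, which is what the simulated frequency reflects.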
The cluster-lasso of Belloni et al. (2016) extends this to accommodate the case where the score is independent across but not within panels $i$. In this case, the ideal penalty loading is just an application of the standard cluster-robust covariance estimator, which provides a consistent estimate of the variance of the $j$ th element of the score vector. The ideal penalty loadings for the cluster-lasso are simply
$$\psi_{j}=\sqrt{\frac{1}{n T} \sum_{i=1}^{n} u_{i j}^{2}} \quad \text { where } u_{i j}:=\sum_{t} x_{i j t} \varepsilon_{i t}$$
Belloni et al. (2016) show that this ideal penalty can be implemented in the same way as in the previous cases, i.e., by using an initial set of residuals and then iterating. They recommend the same overall penalty level as in the heteroskedastic case, $\lambda=2 c \sqrt{n} \Phi^{-1}(1-\gamma /(2 p))$, except that $\gamma=0.1 / \log (n)$ is computed with $n$ equal to the number of clusters rather than the number of observations.
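The cluster loadings translate directly into code. A hypothetical helper (names are mine, assuming a balanced panel stored as $(i,t)$-indexed arrays) might look like:

```python
# Hypothetical helper computing cluster-lasso penalty loadings for a
# balanced panel: psi_j = sqrt((1/(nT)) * sum_i u_ij^2),
# where u_ij = sum_t x_ijt * eps_it.
import numpy as np

def cluster_loadings(X, eps):
    """X: (n, T, p) array of regressors; eps: (n, T) array of residuals."""
    n, T, _ = X.shape
    u = np.einsum('itj,it->ij', X, eps)   # within-cluster score sums u_ij
    return np.sqrt(np.sum(u**2, axis=0) / (n * T))
```

Summing the scores within each cluster before squaring is exactly what makes the loadings robust to arbitrary within-panel correlation, mirroring the cluster-robust covariance estimator.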
