## Bridge Estimators

Penalized estimation methods, such as penalized linear least squares and penalized likelihood, have drawn much attention in recent years. They are widely used because they select variables and estimate parameters simultaneously in the linear regression model
$$y_i=x_i^T \delta+\varepsilon_i, i=1,2, \ldots, n .$$
Here, $y_i \in \mathbb{R}$ is the $i$-th response, $x_i$ $(i=1,2, \ldots, n)$ is a $p$-vector of covariates, $\delta$ is a $p$-vector of unknown parameters, and the $\varepsilon_i$ are independent, identically distributed random errors. The bridge estimation method, suggested by Frank and Friedman in 1993 [11], covers a large class of penalty methods with penalty function $\sum\left|\delta_j\right|^\alpha$, $\alpha>0$. The bridge estimator $\hat{\delta}_B$ is obtained by solving the optimization problem
$$\underset{\delta}{\operatorname{minimize}} \sum_{i=1}^n\left(y_i-x_i^T \delta\right)^2+\varphi \sum_{j=1}^p\left|\delta_j\right|^\alpha,$$
where the first term is the squared $L_2$-norm of the residual vector and $\varphi$ is a penalty parameter that controls the trade-off between the first and the second term. As seen in Equation 1.3, the objective function is penalized by the $L_\alpha$-norm to obtain the bridge estimator $\hat{\delta}_B$, which shrinks the estimates of the parameters in Equation 1.2 towards 0. Liu et al. in 2007 [22] discussed the effect of the $L_\alpha$ penalty for different values of $\alpha$. If $\alpha=1$, bridge estimation produces the Lasso (Least Absolute Shrinkage and Selection Operator) [32]; if $\alpha=2$, it produces ridge estimation, also known as Tikhonov regularization [17]. For $\alpha \leq 1$, the bridge estimator selects the significant variables of the regression model by shrinking small $\left|\delta_j\right|$ to exactly zero. In contrast, Knight and Fu in 2000 [20], who studied the asymptotic distributions of bridge estimators when the number of covariates is fixed, noted that for $\alpha>1$ the amount of shrinkage towards zero increases with the magnitude of the regression coefficients being estimated.

Liu et al. in 2007 [22] also pointed out that, in a penalized estimation problem, the value of $\alpha$ should not be chosen larger than necessary, so that the bias for large parameters remains acceptable.
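The contrast between $\alpha=1$ and $\alpha=2$ can be illustrated with a minimal Python sketch. It assumes an orthonormal design ($X^T X=I$), which is not stated in the text; under that assumption the bridge objective separates into one-dimensional problems of the form $(z-d)^2+\varphi|d|^\alpha$, where $z$ is the least squares coefficient. The function names `bridge_objective`, `lasso_coordinate` and `ridge_coordinate` are hypothetical and chosen only for this illustration.

```python
def bridge_objective(y, X, delta, phi, alpha):
    """Residual sum of squares plus phi * sum_j |delta_j|**alpha."""
    rss = sum((yi - sum(xij * dj for xij, dj in zip(xi, delta))) ** 2
              for yi, xi in zip(y, X))
    penalty = phi * sum(abs(dj) ** alpha for dj in delta)
    return rss + penalty


def lasso_coordinate(z, phi):
    """Minimizer of (z - d)**2 + phi*|d| (alpha = 1): soft-thresholding.

    Assumes an orthonormal design, so each coefficient is solved separately.
    """
    magnitude = max(abs(z) - phi / 2.0, 0.0)
    return magnitude if z >= 0 else -magnitude


def ridge_coordinate(z, phi):
    """Minimizer of (z - d)**2 + phi*d**2 (alpha = 2): proportional shrinkage."""
    return z / (1.0 + phi)


if __name__ == "__main__":
    phi = 1.0
    # Small, medium and large least squares coefficients.
    for z in (0.3, 2.0, 5.0):
        print(z, lasso_coordinate(z, phi), ridge_coordinate(z, phi))
```

In this sketch the $\alpha=1$ solution sets the small coefficient to exactly zero and subtracts the same amount, $\varphi/2$, from the larger ones. The $\alpha=2$ solution never reaches zero and removes $\varphi z/(1+\varphi)$, which grows with $|z|$, in line with the remark of Knight and Fu for $\alpha>1$.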

## Construction of additive nonparametric component

In this section, we present the form of the bridge penalty for the PNLM. Let $\left\{\left(y_i, x_i, u_i\right), i=1,2, \ldots, n\right\}$ be a random sample from the model expressed by Equation 1.4. The nonlinear least squares objective function for Equation 1.4 is written as
$$Q(h, \delta)=\sum_{i=1}^n\left(y_i-f\left(x_i, \delta\right)-h\left(u_i\right)\right)^2 .$$
The estimation procedure for the parameters of the PNLM in Equation 1.4 consists of two steps. In the first step, a linear approximation of $f(\cdot, \delta)$ is obtained through a Taylor expansion, since $f(\cdot, \delta)$ is nonlinear in $\delta$. We then seek the minimizer of Equation 1.4 by solving the normal equations obtained from the derivative of $Q(h, \delta)$ with respect to $\delta$. These normal equations cannot be solved analytically because they are nonlinear; therefore, iterative techniques such as the Newton-Raphson algorithm should be used [3]. Applying the Taylor expansion to $f(\cdot, \delta)$ at the initial point $\hat{\delta}_c$, where $\hat{\delta}_c$ is a consistent estimate of $\delta$, we get
$$f(x, \delta)=f\left(x, \hat{\delta}_c\right)+f^{\prime}\left(x, \hat{\delta}_c\right)^T\left(\delta-\hat{\delta}_c\right)+o_p\left(\left\|\delta-\hat{\delta}_c\right\|\right) .$$
The initial point $\hat{\delta}_c$ in Equation 1.5 is obtained as the solution of the following nonlinear least squares optimization problem:
$$\hat{\delta}_c=\underset{\delta}{\operatorname{argmin}} \sum_{i=1}^{n-1}\left(y_{(i+1)}-y_{(i)}-f\left(x_{(i+1)}, \delta\right)+f\left(x_{(i)}, \delta\right)\right)^2,$$
where $\left(x_{(i)}, u_{(i)}, y_{(i)}\right)$ $(i=1,2, \ldots, n)$ is the sample ordered from the smallest to the largest value of the variable $u_i$ [35]. Li and Mei in 2013 [21] have shown that, under some conditions, $\hat{\delta}_c$ is $\sqrt{n}$-consistent. Thus, neglecting the remainder term of the Taylor expansion, the $i$-th observation $y_i$ of the response variable can be written as
$$y_i=f\left(x_i, \hat{\delta}_c\right)+f^{\prime}\left(x_i, \hat{\delta}_c\right)^T\left(\delta-\hat{\delta}_c\right)+h\left(u_i\right)+\varepsilon_i .$$
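The first step can be sketched in Python under loudly stated assumptions. The regression function $f(x, \delta)=\exp(\delta x)$ with scalar $\delta$, the smooth component $h(u)=\sin(2\pi u)$, the noise level and all function names are hypothetical choices made only for this illustration. A grid search followed by golden-section refinement stands in for the Newton-type iterations mentioned in the text, because it is easy to check by hand. The helper `linearize` simply moves the known terms of the last equation to the left-hand side, giving $y_i^{*}=y_i-f(x_i, \hat{\delta}_c)+f^{\prime}(x_i, \hat{\delta}_c)\hat{\delta}_c$ with regressor $z_i=f^{\prime}(x_i, \hat{\delta}_c)$, so that $y_i^{*}=z_i \delta+h(u_i)+\varepsilon_i$.

```python
import math
import random


def f(x, delta):
    """Hypothetical nonlinear regression function (an assumption)."""
    return math.exp(delta * x)


def f_prime(x, delta):
    """Derivative of f with respect to delta."""
    return x * math.exp(delta * x)


def difference_objective(delta, xs, ys):
    """Sum of squared first differences over the sample ordered by u."""
    total = 0.0
    for i in range(len(xs) - 1):
        resid = (ys[i + 1] - ys[i]) - (f(xs[i + 1], delta) - f(xs[i], delta))
        total += resid ** 2
    return total


def initial_estimate(x, u, y, lo=-1.0, hi=2.0, grid=61, iters=60):
    """Difference-based estimate of delta: grid search, then golden-section."""
    order = sorted(range(len(u)), key=lambda i: u[i])  # order sample by u
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    step = (hi - lo) / (grid - 1)
    candidates = [lo + k * step for k in range(grid)]
    best = min(candidates, key=lambda d: difference_objective(d, xs, ys))
    a, b = best - step, best + step
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if difference_objective(c, xs, ys) < difference_objective(d, xs, ys):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return (a + b) / 2.0


def linearize(x, y, delta_c):
    """Rearranged Taylor step: y_star = z * delta + h(u) + error."""
    z = [f_prime(xi, delta_c) for xi in x]
    y_star = [yi - f(xi, delta_c) + zi * delta_c
              for xi, yi, zi in zip(x, y, z)]
    return z, y_star


def simulate(n=200, delta_true=0.5, seed=0):
    """Synthetic data from y = f(x, delta) + h(u) + noise (all assumed)."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 2.0) for _ in range(n)]
    u = [rng.uniform(0.0, 1.0) for _ in range(n)]
    y = [f(xi, delta_true) + math.sin(2 * math.pi * ui) + rng.gauss(0.0, 0.1)
         for xi, ui in zip(x, u)]
    return x, u, y


if __name__ == "__main__":
    x, u, y = simulate()
    delta_c = initial_estimate(x, u, y)
    z, y_star = linearize(x, y, delta_c)
    print("initial estimate:", delta_c)
```

Differencing neighbouring observations in the $u$-ordering nearly cancels $h$, because $h(u_{(i+1)})-h(u_{(i)})$ is small when adjacent $u$ values are close. This is why the initial estimate can be computed without estimating $h$ first.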


