### Hypothesis testing

In principle, hypothesis testing problems are straightforward. Consider the case in which we have to decide between two hypotheses with positive probability content, that is, $H_{0}: \theta \in \Theta_{0}$ and $H_{1}: \theta \in \Theta_{1}$. Then, theoretically, the choice of which hypothesis to accept can be treated as a simple decision problem (see Section 2.3). If we accept $H_{0}$ (or $H_{1}$) when it is true, we lose nothing. If we accept $H_{0}$ when $H_{1}$ is true, we lose a quantity $l_{01}$, and if we accept $H_{1}$ when $H_{0}$ is true, we lose a quantity $l_{10}$. Then, given the posterior probabilities $P\left(H_{0} \mid \mathbf{x}\right)$ and $P\left(H_{1} \mid \mathbf{x}\right)$, the expected loss if we accept $H_{0}$ is $P\left(H_{1} \mid \mathbf{x}\right) l_{01}$ and the expected loss if we accept $H_{1}$ is $P\left(H_{0} \mid \mathbf{x}\right) l_{10}$. The supported hypothesis is the one that minimizes the expected loss. In particular, if $l_{01}=l_{10}$, we should simply select the hypothesis that is most likely a posteriori.
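
As a rough illustration of the decision rule above, the following Python sketch compares the two expected losses and returns the hypothesis with the smaller one. The function name `choose_hypothesis` and its parameter names are hypothetical, introduced only for this example, and breaking ties in favor of $H_{0}$ is an arbitrary assumption.

```python
def choose_hypothesis(p_h0, loss_01, loss_10):
    """Pick the hypothesis with the smaller posterior expected loss.

    p_h0    -- posterior probability P(H0 | x)
    loss_01 -- loss from accepting H0 when H1 is true (l_01)
    loss_10 -- loss from accepting H1 when H0 is true (l_10)
    """
    p_h1 = 1.0 - p_h0
    expected_loss_accept_h0 = p_h1 * loss_01
    expected_loss_accept_h1 = p_h0 * loss_10
    # Ties go to H0; this tie-breaking rule is an assumption of the sketch.
    if expected_loss_accept_h0 <= expected_loss_accept_h1:
        return "H0"
    return "H1"
```

With equal losses the rule reduces to picking the more probable hypothesis, so `choose_hypothesis(0.6, 1.0, 1.0)` selects $H_{0}$. Raising `loss_01` to 5 makes wrongly accepting $H_{0}$ costly enough that $H_{1}$ is selected even though it is less probable.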

In many cases, such as model selection problems or multiple hypothesis testing problems, the specification of the prior probabilities in favor of each model or hypothesis may be very complicated and an alternative procedure that is not dependent on these prior probabilities may be preferred. The standard tool for such contexts is the Bayes factor.

Definition 2.1: Suppose that $H_{0}$ and $H_{1}$ are two hypotheses with prior probabilities $P\left(H_{0}\right)$ and $P\left(H_{1}\right)$ and posterior probabilities $P\left(H_{0} \mid \mathbf{x}\right)$ and $P\left(H_{1} \mid \mathbf{x}\right)$, respectively. Then, the Bayes factor in favor of $H_{0}$ is
$$B_{1}^{0}=\frac{P\left(H_{1}\right) P\left(H_{0} \mid \mathbf{x}\right)}{P\left(H_{0}\right) P\left(H_{1} \mid \mathbf{x}\right)}$$
It is easily shown that the Bayes factor reduces to the marginal likelihood ratio, that is,
$$B_{1}^{0}=\frac{f\left(\mathbf{x} \mid H_{0}\right)}{f\left(\mathbf{x} \mid H_{1}\right)}$$
which is independent of the values of $P\left(H_{0}\right)$ and $P\left(H_{1}\right)$ and is, therefore, a measure of the evidence in favor of $H_{0}$ provided by the data. Note, however, that, in general, it is not totally independent of prior information as
$$f\left(\mathbf{x} \mid H_{0}\right)=\int_{\Theta_{0}} f\left(\mathbf{x} \mid H_{0}, \theta_{0}\right) f\left(\theta_{0} \mid H_{0}\right) d \theta_{0}$$
which depends on the prior density under $H_{0}$, and similarly for $f\left(\mathbf{x} \mid H_{1}\right)$. Kass and Raftery (1995) presented the following Table 2.1, which indicates the strength of evidence in favor of $H_{0}$ provided by the Bayes factor.
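
To make the marginal likelihood ratio concrete, here is a minimal Python sketch for a setting that is not taken from the text: binomial data with $H_{0}: \theta \leq 0.5$ against $H_{1}: \theta > 0.5$, and a uniform prior density for $\theta$ within each hypothesis. The function names, the choice of priors, and the midpoint-rule integration are all assumptions made for illustration.

```python
from math import comb


def binomial_likelihood(x, n, theta):
    """f(x | theta) for x successes in n Bernoulli trials."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def marginal_likelihood(x, n, lower, upper, steps=10_000):
    """Approximate f(x | H) when theta is uniform on (lower, upper) under H.

    Uses the midpoint rule to integrate likelihood times prior density.
    """
    width = (upper - lower) / steps
    prior_density = 1.0 / (upper - lower)
    total = 0.0
    for i in range(steps):
        theta = lower + (i + 0.5) * width
        total += binomial_likelihood(x, n, theta) * prior_density
    return total * width


def bayes_factor(x, n):
    """Bayes factor in favor of H0: theta <= 0.5 against H1: theta > 0.5."""
    m0 = marginal_likelihood(x, n, 0.0, 0.5)
    m1 = marginal_likelihood(x, n, 0.5, 1.0)
    return m0 / m1
```

For perfectly balanced data such as 5 successes in 10 trials, the two marginal likelihoods coincide by symmetry and the Bayes factor is 1. Fewer successes push it above 1, in favor of $H_{0}$. Changing the within-hypothesis priors changes the result, which is the dependence on prior information noted above.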

## Prediction

In many applications, rather than being interested in the parameters, we shall be more concerned with the prediction of future observations of the variable of interest. This is especially true in the case of stochastic processes, when we will typically be interested in predicting both the short- and long-term behavior of the process.

To predict future values of the phenomenon, say $\mathbf{Y}$, we use the predictive distribution. Given the current data $\mathbf{x}$, if we knew the value of $\boldsymbol{\theta}$, we would use the conditional predictive distribution $f(\mathbf{y} \mid \mathbf{x}, \boldsymbol{\theta})$. However, since there is uncertainty about $\boldsymbol{\theta}$, modeled through the posterior distribution $f(\boldsymbol{\theta} \mid \mathbf{x})$, we integrate $\boldsymbol{\theta}$ out to obtain the predictive density
$$f(\mathbf{y} \mid \mathbf{x})=\int f(\mathbf{y} \mid \boldsymbol{\theta}, \mathbf{x}) f(\boldsymbol{\theta} \mid \mathbf{x}) \mathrm{d} \boldsymbol{\theta} \tag{2.3}$$
Note that when the sampled values of the phenomenon are conditionally IID, formula (2.3) simplifies to
$$f(\mathbf{y} \mid \mathbf{x})=\int f(\mathbf{y} \mid \boldsymbol{\theta}) f(\boldsymbol{\theta} \mid \mathbf{x}) \mathrm{d} \boldsymbol{\theta}$$
although, in general, to predict the future values of stochastic processes, this simplification will not be available. The predictive density may be used to provide point or set forecasts and test hypotheses about future observations, much as we did earlier.

Example 2.6: In the normal-normal example, suppose that we wish to predict the next observation $Y=X_{n+1}$. When a uniform prior is used, $X_{n+1} \mid \mathbf{x} \sim \mathrm{N}\left(\bar{x}, \frac{n+1}{n} \sigma^{2}\right)$. Then a predictive $100(1-\alpha) \%$ probability interval is
$$\left[\bar{x}-z_{\alpha / 2} \sigma \sqrt{(n+1) / n}, \bar{x}+z_{\alpha / 2} \sigma \sqrt{(n+1) / n}\right].$$
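
The interval in Example 2.6 can be sketched in Python as follows, assuming a known standard deviation $\sigma$ and a uniform prior on the mean. The function names and the simulation settings are hypothetical. The second function approximates the predictive integral by simulation: it draws the mean from its posterior $\mathrm{N}(\bar{x}, \sigma^{2}/n)$, then draws a future observation given that mean, and reports how often the draw lands inside the interval.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def predictive_interval(data, sigma, alpha=0.05):
    """100(1 - alpha)% predictive interval for the next observation."""
    n = len(data)
    xbar = fmean(data)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 normal quantile
    half_width = z * sigma * sqrt((n + 1) / n)
    return xbar - half_width, xbar + half_width


def simulated_coverage(data, sigma, alpha=0.05, draws=20_000, seed=0):
    """Monte Carlo estimate of the predictive probability of the interval."""
    rng = random.Random(seed)
    n = len(data)
    xbar = fmean(data)
    lower, upper = predictive_interval(data, sigma, alpha)
    hits = 0
    for _ in range(draws):
        mu = rng.gauss(xbar, sigma / sqrt(n))  # posterior draw of the mean
        y = rng.gauss(mu, sigma)               # future observation given mu
        hits += lower <= y <= upper
    return hits / draws
```

For `alpha=0.05` the simulated coverage should be close to 0.95, up to Monte Carlo error. The same two-stage sampling idea applies when the predictive integral has no closed form.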

## Sensitivity analysis and objective Bayesian methods

As mentioned earlier, prior information may often be elicited from one or more experts. In such cases, the postulated prior distribution will often be only an approximation to the expert’s beliefs. If different experts disagree, there may be considerable uncertainty about the appropriate prior distribution to apply. It is then important to assess the sensitivity of any posterior results to changes in the prior distribution. This is typically done by considering appropriate classes of prior distributions, close to the postulated expert prior distribution, and then assessing how the posterior results vary over such classes.

Example 2.8: Assume that the gambler in the gambler’s ruin problem is not certain about her $\operatorname{Be}(5,5)$ prior and wishes to consider the sensitivity of the posterior predictive ruin probability over a reasonable class of alternatives. One possible class of priors that generalizes the gambler’s original prior is
$$G=\{f: f \sim \operatorname{Be}(c, c), c>0\},$$
the class of symmetric beta priors. Then, over this class of priors, it can be shown that the gambler’s posterior predictive ruin probability varies between $0.231$, when $c \rightarrow 0$ and $0.8$, when $c \rightarrow \infty$. This shows that there is a large degree of variation of this predictive probability over this class of priors.
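
The same kind of sensitivity check can be sketched in Python for a simpler quantity than the ruin probability, whose data are not reproduced here. The sketch below assumes hypothetical binomial data and tracks the posterior mean of the success probability as $c$ ranges over a grid of values in the class $G$. The function names, the data, and the grid are all assumptions for illustration, and the output does not reproduce the figures 0.231 and 0.8 quoted above.

```python
def posterior_mean(c, successes, trials):
    """Posterior mean of a success probability under a Be(c, c) prior.

    With binomial data the posterior is Be(c + successes, c + failures),
    whose mean is (c + successes) / (2c + trials).
    """
    return (c + successes) / (2 * c + trials)


def sensitivity_range(successes, trials, c_values):
    """Smallest and largest posterior mean over the supplied values of c."""
    means = [posterior_mean(c, successes, trials) for c in c_values]
    return min(means), max(means)
```

For example, `sensitivity_range(3, 10, [0.001, 1, 5, 1000])` spans roughly 0.3 to 0.5: as $c \rightarrow 0$ the posterior mean approaches the sample proportion, and as $c \rightarrow \infty$ the prior dominates and the mean approaches 0.5.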

When little prior information is available, or in order to promote a more objective analysis, we may try to apply a prior distribution that provides little information and ‘lets the data speak for themselves’. In such cases, we may use a noninformative prior. When $\boldsymbol{\Theta}$ is discrete, a sensible noninformative prior is a uniform distribution. However, when $\boldsymbol{\Theta}$ is continuous, a uniform distribution is not necessarily the best choice. In the univariate case, the most common approach is to use the Jeffreys prior.

Definition 2.3: Suppose that $X \mid \theta \sim f(\cdot \mid \theta)$. The Jeffreys prior for $\theta$ is given by
$$f(\theta) \propto \sqrt{I(\theta)},$$
where $I(\theta)=-E_{X}\left[\frac{d^{2}}{d \theta^{2}} \log f(X \mid \theta)\right]$ is the expected Fisher information.
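
As a short worked illustration of Definition 2.3, consider a single Bernoulli observation, a case chosen here for simplicity rather than taken from the text. With $f(x \mid \theta)=\theta^{x}(1-\theta)^{1-x}$ for $x \in\{0,1\}$,
$$\log f(x \mid \theta)=x \log \theta+(1-x) \log (1-\theta), \qquad \frac{d^{2}}{d \theta^{2}} \log f(x \mid \theta)=-\frac{x}{\theta^{2}}-\frac{1-x}{(1-\theta)^{2}}.$$
Taking expectations with $E_{X}[X]=\theta$ gives
$$I(\theta)=\frac{\theta}{\theta^{2}}+\frac{1-\theta}{(1-\theta)^{2}}=\frac{1}{\theta(1-\theta)},$$
so that
$$f(\theta) \propto \sqrt{I(\theta)}=\theta^{-1 / 2}(1-\theta)^{-1 / 2},$$
which is the kernel of a $\operatorname{Be}(1 / 2,1 / 2)$ density. In this case the Jeffreys prior is therefore a member of the symmetric beta class $G$ of Example 2.8, with $c=1 / 2$.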


