### Elements of Bayesianism

Bayesian inference involves placing a probability distribution on all unknown quantities in a statistical problem. So in addition to a probability model for the data, probability is also specified for any unknown parameters associated with it. If future observables are to be predicted, probability is posed for these as well. The Bayesian inference is thus the conditional distribution of unknown parameters (and/or future observables) given the data. This can be quite simple if everything being modeled is discrete, or quite complex if the class of models considered for the data is broad.
This chapter presents the fundamental elements of Bayesian testing for simple versus simple, composite versus composite, and point null versus composite alternative hypotheses. Several applications are given. The use of Jeffreys' “noninformative” priors for making general Bayesian inferences in binomial and negative binomial sampling, and methods for hypergeometric and negative hypergeometric sampling, are also discussed. This discussion leads to a presentation and proof of de Finetti’s theorem for Bernoulli trials. The chapter then presents another de Finetti result: that Bayesian assignment of probabilities is both necessary and sufficient for “coherence” in a particular setting. The chapter concludes with a discussion and illustration of model selection.

## TESTING A COMPOSITE VS. A COMPOSITE

Suppose the parameter or set of parameters $\theta \in \Theta$ is assigned a prior distribution, subjective or objective. That is, we may be sampling $\theta$ from a hypothetical population with probability function $g(\theta)$, our beliefs about $\theta$ may be summarized by a $g(\theta)$, or we may assume a $g(\theta)$ that purports to reflect our prior ignorance. We are interested in deciding whether
$$H_{0}: \theta \in \Theta_{0} \quad \text { or } \quad H_{1}: \theta \in \Theta_{1} \quad \text { for } \quad \Theta_{0} \cap \Theta_{1}=\emptyset$$

Now the posterior probability function is
$$p(\theta \mid D) \propto L(\theta \mid D) g(\theta) \quad \text { or } \quad p(\theta \mid D)=\frac{L(\theta \mid D) g(\theta)}{\int_{\Theta} L(\theta \mid D) d G(\theta)}$$
using the Lebesgue-Stieltjes integral representation in the denominator, where $G$ is the prior distribution function corresponding to $g$.
Usually $\Theta_{0} \cup \Theta_{1}=\Theta$ but this is not necessary. We calculate
$$\begin{aligned} &P\left(\theta \in \Theta_{0} \mid D\right)=\int_{\Theta_{0}} d P(\theta \mid D) \\ &P\left(\theta \in \Theta_{1} \mid D\right)=\int_{\Theta_{1}} d P(\theta \mid D) \end{aligned}$$
and calculate the posterior odds
$$\frac{P\left(\theta \in \Theta_{0} \mid D\right)}{P\left(\theta \in \Theta_{1} \mid D\right)}$$
If this ratio is greater than some predetermined value $k$, choose $H_{0}$; if it is less, choose $H_{1}$; and if it equals $k$, be indifferent.
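As a minimal sketch of the procedure above, assume binomial data ($r$ successes in $n$ trials), a uniform prior $g(\theta)=1$ on $(0,1)$, and the hypotheses $H_{0}: \theta \leq 1/2$ versus $H_{1}: \theta > 1/2$. None of these choices comes from the text; the function names `posterior_odds` and `decide`, the split point, and the grid size are hypothetical and chosen only for illustration.

```python
def posterior_odds(r, n, split=0.5, prior=lambda theta: 1.0, grid=20000):
    """Posterior odds P(theta <= split | D) / P(theta > split | D).

    Assumes a binomial likelihood theta^r (1 - theta)^(n - r) and integrates
    likelihood * prior with a midpoint rule on (0, 1). The normalizing
    constant of the posterior cancels in the ratio, so it is never computed.
    """
    step = 1.0 / grid
    mass_h0 = 0.0  # unnormalized posterior mass on Theta_0
    mass_h1 = 0.0  # unnormalized posterior mass on Theta_1
    for i in range(grid):
        theta = (i + 0.5) * step  # midpoint of the i-th cell
        weight = theta ** r * (1.0 - theta) ** (n - r) * prior(theta)
        if theta <= split:
            mass_h0 += weight
        else:
            mass_h1 += weight
    return mass_h0 / mass_h1


def decide(odds, k=1.0):
    """Apply the threshold rule: H0 if odds > k, H1 if odds < k."""
    if odds > k:
        return "H0"
    if odds < k:
        return "H1"
    return "indifferent"


odds = posterior_odds(r=7, n=10)
print(odds, decide(odds))
```

With 7 successes in 10 trials the odds come out near 0.128, so the threshold $k=1$ selects $H_{1}$. A different prior can be tried by passing another function as `prior`.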

## SOME REMARKS ON PRIORS FOR THE BINOMIAL

1. Personal Priors. If one can elicit a personal subjective prior for $\theta$, then the resulting posterior for $\theta$ is personal as well and, depending on the reasoning that went into it, may or may not convince anyone else. A convenient choice, often used when subjective opinion can be molded into this form, is the beta prior
$$g(\theta \mid a, b) \propto \theta^{a-1}(1-\theta)^{b-1}$$
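Combined with the binomial likelihood for $r$ successes in $n$ trials, $L(\theta \mid r) \propto \theta^{r}(1-\theta)^{n-r}$, this yields the posterior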
$$g(\theta \mid a, b, r) \propto \theta^{a+r-1}(1-\theta)^{b+n-r-1}$$
2. So-Called Ignorance or Informationless or Reference Priors. Laplace apparently interpreted Bayes as having used a uniform prior in his “Scholium” in the absence of information regarding $\theta$. Fisher objected to this essentially on the grounds of a lack of invariance. If ignorance justifies taking $\theta$ to be uniform, he argued, then it equally justifies taking some function of it, say $\tau=\theta^{3}$ (or any $\tau=\tau(\theta)$), to be uniform. But $g(\tau)=1$, $0<\tau<1$, implies $g(\theta)=g(\tau)\left|d \tau / d \theta\right|=3 \theta^{2}$ instead of $g(\theta)=1$. Hence one will get different answers depending on which function of the parameter is assumed uniform.
Jeffreys countered this lack of invariance with the following:
The Fisher Information quantity of a probability function $f(x \mid \theta)$ is
$$I(\theta)=E\left[\left(\frac{d \log f}{d \theta}\right)^{2}\right]$$
assuming it exists. Then set
$$g(\theta)=I^{\frac{1}{2}}(\theta)$$
Now suppose $\tau=\tau(\theta)$. Then
$$\begin{aligned} I(\tau) &=E\left[\left(\frac{d \log f}{d \tau}\right)^{2}\right]=E\left[\left(\frac{d \log f}{d \theta} \times \frac{d \theta}{d \tau}\right)^{2}\right] \\ &=E\left[\left(\frac{d \log f}{d \theta}\right)^{2}\right] \times\left(\frac{d \theta}{d \tau}\right)^{2} \end{aligned}$$
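Taking square roots in the last display gives $I^{1/2}(\tau)=I^{1/2}(\theta)\left|d \theta / d \tau\right|$, which is the usual change-of-variables rule for a density, so the prior $g=I^{1/2}$ is carried consistently from one parameterization to another. The following minimal Python sketch checks this numerically for a single Bernoulli trial under Fisher's reparameterization $\tau=\theta^{3}$. It then applies the beta update from remark 1 with $a=b=\tfrac{1}{2}$, the shape that $I^{1/2}(\theta) \propto \theta^{-1/2}(1-\theta)^{-1/2}$ takes in the binomial case. The function names (`fisher_info`, `beta_binomial_update`), the finite-difference step, and the test point $\theta=0.3$ are illustrative assumptions, not part of the source.

```python
import math


def fisher_info(log_f, param, eps=1e-6):
    """Fisher information E[(d log f / d param)^2] for one Bernoulli trial.

    log_f(x, param) is the log probability of outcome x in {0, 1}.
    The score is approximated by a central difference (an assumption made
    for brevity; an analytic derivative would work equally well).
    """
    total = 0.0
    for x in (0, 1):
        score = (log_f(x, param + eps) - log_f(x, param - eps)) / (2.0 * eps)
        total += math.exp(log_f(x, param)) * score ** 2
    return total


def log_f_theta(x, theta):
    """Bernoulli log probability in the original parameter theta."""
    return math.log(theta) if x == 1 else math.log(1.0 - theta)


def log_f_tau(x, tau):
    """Same model, reparameterized by tau = theta**3."""
    return log_f_theta(x, tau ** (1.0 / 3.0))


def beta_binomial_update(a, b, r, n):
    """Beta(a, b) prior + r successes in n trials -> Beta(a + r, b + n - r)."""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return a + r, b + n - r


theta = 0.3
tau = theta ** 3
dtheta_dtau = (1.0 / 3.0) * tau ** (-2.0 / 3.0)

# Jeffreys prior evaluated directly in tau ...
direct = math.sqrt(fisher_info(log_f_tau, tau))
# ... and obtained by transforming the Jeffreys prior in theta.
transformed = math.sqrt(fisher_info(log_f_theta, theta)) * abs(dtheta_dtau)
print(direct, transformed)

# Posterior under the Beta(1/2, 1/2) prior for 7 successes in 10 trials.
print(beta_binomial_update(0.5, 0.5, r=7, n=10))
```

The two printed values should agree up to finite-difference error, in contrast to the uniform prior, where the answer depends on which parameter is taken to be uniform.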

