### ASSESSING THE TREATMENT MEANS

In the last chapter we described the linear model for a simple experiment, found estimates of the parameters, and described a test for the hypothesis that there is no overall treatment effect. In this chapter we cover the next step of examining more closely the pattern of differences among the treatment means. There are a number of approaches. One extreme is to test only the hypotheses framed before the experiment was carried out, but this approach wastes much of the information from the experiment. On the other hand, carrying out conventional hypothesis tests on every effect that looks interesting can be very misleading, for reasons which we now examine.
There are three main difficulties. First, two tests based on the same experiment are unlikely to be independent. Tests will usually involve the same estimate of variance, and if this estimate happens to be too small, every test will be too significant. Further, any comparisons involving the same means will be affected in the same way by chance differences between estimate and parameter. As an example, consider a case where there are three treatments assessing a new drug. Treatment “A” is the placebo, treatment “B” the drug administered in one big dose, and treatment “C” the drug administered in two half doses. If chance variation happens to make the number of cures on the experimental units using the placebo (A) rather low, the differences between $A$ and $B$, and between $A$ and $C$, will both be overstated in the same way. Therefore two significant t-tests, one between $A$ and $B$, the other between $A$ and $C$, cannot be taken as independent corroboration of the effectiveness of the drug.
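The dependence between the $A$ versus $B$ and $A$ versus $C$ comparisons can be illustrated with a small simulation. This is only a sketch under stated assumptions: normally distributed treatment means, no true treatment effect, and equal replication. The function name, the replication `reps`, and `sigma` are hypothetical choices made for illustration, not values from the text.

```python
import random


def difference_correlation(n_sims=20000, reps=5, sigma=1.0, seed=1):
    """Correlation between (mean_A - mean_B) and (mean_A - mean_C).

    Assumes no treatment effects and equal replication (hypothetical setup).
    """
    rng = random.Random(seed)
    se_mean = sigma / reps ** 0.5  # standard error of a single treatment mean
    d_ab, d_ac = [], []
    for _ in range(n_sims):
        mean_a = rng.gauss(0.0, se_mean)  # the placebo mean is shared by both differences
        mean_b = rng.gauss(0.0, se_mean)
        mean_c = rng.gauss(0.0, se_mean)
        d_ab.append(mean_a - mean_b)
        d_ac.append(mean_a - mean_c)

    # Pearson correlation computed by hand to keep the sketch dependency-free.
    m_ab = sum(d_ab) / n_sims
    m_ac = sum(d_ac) / n_sims
    cov = sum((x - m_ab) * (y - m_ac) for x, y in zip(d_ab, d_ac))
    var_ab = sum((x - m_ab) ** 2 for x in d_ab)
    var_ac = sum((y - m_ac) ** 2 for y in d_ac)
    return cov / (var_ab * var_ac) ** 0.5


corr = difference_correlation()
print(f"correlation between A-B and A-C differences: {corr:.2f}")
```

With equal replication the theoretical correlation is 0.5, because both differences contain the same placebo mean. A chance low value for $A$ therefore pushes both comparisons in the same direction.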

## SPECIFIC HYPOTHESES

Any experiment should be designed to answer specific questions. If these questions are stated clearly, it will be possible to construct a single linear function of the treatment means which answers each question. It can be difficult for the statistician to discover what these questions are, but this type of problem is beyond the scope of this book. We will present some examples.
Example 6.2.1 Drug comparison example
In the drug comparison experiment mentioned in Section 1, one question might be: is the drug effective? Rather than doing two separate tests ($A$ vs. $B$ and $A$ vs. $C$), a single test of $A$ against the average of $B$ and $C$ gives an unambiguous answer which uses all the relevant data. That is, use
$\bar{y}_{A}-\left(\bar{y}_{B}+\bar{y}_{C}\right) / 2$ with variance $\left[1 / r_{A}+\left(1 / r_{B}+1 / r_{C}\right) / 4\right] \sigma^{2}$,
where $r_{A}$, $r_{B}$ and $r_{C}$ are the numbers of observations on each treatment.
Having decided that the drug has an effect, the next question may be: how much better are two half doses than one complete dose? This will be estimated by the difference between the treatment means for $B$ and $C$. For inferences, remember that $\sigma^{2}$ is estimated by $s^{2}$, and this appears with its degrees of freedom in the ANOVA table.
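As a rough numerical illustration of the two questions above, the sketch below evaluates a linear function of the treatment means together with its standard error. The treatment means, replicate counts and $s^{2}$ are invented for illustration only, and `contrast_estimate` is a hypothetical helper, not a function from any library.

```python
import math


def contrast_estimate(means, reps, coeffs, s2):
    """Return (estimate, standard error) of sum(c_i * ybar_i).

    The variance is s2 * sum(c_i**2 / r_i), which gives
    [1/r_A + (1/r_B + 1/r_C)/4] * s2 for coefficients (1, -1/2, -1/2).
    """
    estimate = sum(c * m for c, m in zip(coeffs, means))
    variance = s2 * sum(c * c / r for c, r in zip(coeffs, reps))
    return estimate, math.sqrt(variance)


# Hypothetical data: means for A (placebo), B (one dose), C (two half doses).
means = [10.0, 14.0, 16.0]
reps = [4, 4, 4]
s2 = 6.0  # residual mean square, as would be read from an ANOVA table

# Question 1: is the drug effective? A against the average of B and C.
est_drug, se_drug = contrast_estimate(means, reps, [1.0, -0.5, -0.5], s2)

# Question 2: do two half doses differ from one full dose? B against C.
est_dose, se_dose = contrast_estimate(means, reps, [0.0, 1.0, -1.0], s2)

print(f"A vs (B+C)/2: estimate={est_drug:.2f}, se={se_drug:.2f}, t={est_drug / se_drug:.2f}")
print(f"B vs C:       estimate={est_dose:.2f}, se={se_dose:.2f}, t={est_dose / se_dose:.2f}")
```

Each t ratio would be referred to the t distribution with the degrees of freedom attached to $s^{2}$.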

## Experimentwise Error Rate

The above are examples of inferences to answer specific questions. Each individual inference will be correct, but there are several inferences being made on each experiment. If all four suggested comparisons were made on the fertilizer experiment, the probability of making at least one type I error would be much higher than the significance level of an individual test. If the traditional $5 \%$ level is used, there are no treatment effects at all, and the individual tests are independent, the number of significant results from the experiment would be a binomial random variable with $n=4$ and $p=0.05$. The probability of no significant results will be $(1-0.05)^{4}$, so that the probability of at least one will be $1-(1-0.05)^{4}=0.185$. If one really wanted to have the error rate per experiment equal to $0.05$, each individual test would have to use a significance level, $p$, satisfying
$$1-(1-p)^{4}=0.05, \quad \text{or} \quad p=0.013$$
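The arithmetic above can be checked with a few lines of code. This is a minimal sketch that assumes independent tests, exactly as in the calculation; the function names are hypothetical.

```python
def experimentwise_rate(alpha, n_tests):
    """P(at least one type I error) for n_tests independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def per_test_level(target_rate, n_tests):
    """Individual level p that solves 1 - (1 - p)**n_tests = target_rate."""
    return 1.0 - (1.0 - target_rate) ** (1.0 / n_tests)


print(round(experimentwise_rate(0.05, 4), 3))  # 0.185
print(round(per_test_level(0.05, 4), 3))       # 0.013
```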

Unfortunately, the underlying assumptions are false because, as we noted in Section 1, the inferences are not independent. The correlation between test statistics will usually be positive because each depends on the same variance estimate. The probability of all four being nonsignificant will therefore be greater than that calculated above, and the value of $p$ given above will be too low. If the error rate per experiment is important, the above procedure at least provides a lower bound. Usually, though, it is sufficient to be suspicious of experiments producing many significant results, particularly if the variance estimate is based on rather few degrees of freedom and is smaller than is usually found in similar experiments. Experimenters should not necessarily be congratulated on obtaining many significant results.
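The effect of a shared variance estimate can be seen in a small Monte Carlo sketch. Several simplifying assumptions are made here that are not in the text: the four test numerators are independent standard normal variables, there are no treatment effects, and the variance estimate has only 4 degrees of freedom. The constant 2.776 is the usual tabulated two-sided 5% critical value of t on 4 degrees of freedom, and the function name is hypothetical.

```python
import random

T_CRIT_DF4 = 2.776  # tabulated two-sided 5% point of t with 4 df


def at_least_one_rate(shared_variance, n_sims=40000, n_tests=4, df=4, seed=2):
    """Estimate P(at least one of n_tests t-tests is significant) under the null.

    If shared_variance is True, all tests in an experiment divide by the same
    variance estimate; otherwise each test gets its own independent estimate.
    """
    rng = random.Random(seed)

    def variance_estimate():
        # s^2 / sigma^2 is distributed as chi-square(df) / df
        return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df)) / df

    hits = 0
    for _ in range(n_sims):
        s2_common = variance_estimate()
        any_significant = False
        for _ in range(n_tests):
            s2 = s2_common if shared_variance else variance_estimate()
            t_stat = rng.gauss(0.0, 1.0) / s2 ** 0.5
            if abs(t_stat) > T_CRIT_DF4:
                any_significant = True
        hits += any_significant
    return hits / n_sims


rate_shared = at_least_one_rate(shared_variance=True)
rate_independent = at_least_one_rate(shared_variance=False)
print(f"shared variance estimate: {rate_shared:.3f}")
print(f"independent tests:        {rate_independent:.3f}")
```

Under these assumptions the independent case should land near the value 0.185 computed earlier, while the shared-variance case should come out lower, in line with the argument that the probability of all four tests being nonsignificant is greater than the independence calculation suggests.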

In Section 1, another source of dependence was mentioned. This results from the same treatment means being used in different comparisons. If the questions being asked are themselves not independent, the inferences cannot be either. However, it is possible to design a treatment structure so that independent questions can be assessed independently. This will be the topic of the next section.


