Statistics Assignment Help | Experimental Design Assignment and Exam Help | Forward Selection

We shall use the heart data of the last two sections to illustrate this. In Section 3.5, these data are written in correlation form. If the model is to include only one predictor variable, then $B$ would be chosen, as it gives the highest SSR and hence the highest correlation with $y$. Before $B$ is placed in the model, we test that it has a significant effect on $y$ by using an $F$-test or, equivalently, a $t$-test.
We test $\mathrm{H}: \beta_{2}=0$ in the model $y=\beta_{0}+\beta_{2} x_{2}+\varepsilon$:
$$F=(\mathrm{SSR} / 1) /(\mathrm{SSE} / 44)=0.657 /(0.342 / 44)=84.7$$
Clearly, we reject $\mathrm{H}$ and include $x_{2}$ in the model. We now try to add another predictor variable to the model. We look for the variable which, with $B$, gives the highest value of SSR. From Table 3.6.1, we see that the SSR for $B$ and $C$ is $0.715$, which is greater than for $\mathrm{AB}=0.667$, $\mathrm{BD}=0.686$, $\mathrm{FB}=0.676$, and $\mathrm{BE}=0.659$. Does $C$ add significantly to SSR over and above $B$ itself? We use the method of reduced models to determine this.
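As a general sketch of the reduced-model comparison, the statistic can be written as below. The symbols $\mathrm{SSE}_{R}$ and $\mathrm{SSE}_{F}$ (error sums of squares of the reduced and full models), $q$ (number of coefficients set to zero in the reduced model), $n$ (number of observations) and $p$ (number of predictors in the full model) are our own notation, not the source's.

$$F=\frac{\left(\mathrm{SSE}_{R}-\mathrm{SSE}_{F}\right) / q}{\mathrm{SSE}_{F} /(n-p-1)}$$

Under the hypothesis that the reduced model is adequate, and with the usual normal-error assumptions, this is referred to an $F$ distribution with $q$ and $n-p-1$ degrees of freedom. When a single variable is added or removed, $q=1$.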
Full model: $y=\beta_{0}+\beta_{2} x_{2}+\beta_{3} x_{3}+\varepsilon ; \quad \mathrm{SSE}=0.285$
Reduced model: $y=\beta_{0}+\beta_{2} x_{2}+\varepsilon ; \quad \mathrm{SSE}=0.342$
Difference $=0.057$
$$F=0.057 /(0.285 / 43)=8.6$$
The tabulated $F$ at the $5 \%$ level with 1 and 43 degrees of freedom is $4.19$, so that we reject the reduced model in favor of the full model. We then attempt to add a third variable to the model. The variable which adds most to SSR in association with $B$ and $C$ is $D$, as BCD gives an $\mathrm{SSR}=0.718$. An $F$ statistic is evaluated to determine whether $D$ adds significantly to SSR over and above $B$ and $C$.

Full model: $y=\beta_{0}+\beta_{2} x_{2}+\beta_{3} x_{3}+\beta_{4} x_{4}+\varepsilon ; \quad \mathrm{SSE}=0.282$
Reduced model: $y=\beta_{0}+\beta_{2} x_{2}+\beta_{3} x_{3}+\varepsilon ; \quad \mathrm{SSE}=0.285$
Difference $=0.003$
$$F=0.003 /(0.282 / 42)=0.4$$
Clearly this is too small to reject the reduced model, and we select as the optimal model that with $B$ and $C$ as predictor variables.
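The forward steps above can be retraced in a few lines of Python. This is only a sketch: the function name `partial_f`, the `steps` layout and the single cut-off `F_CRIT` (the $5\%$ value of $4.08$ that the text quotes for 1 and 40 degrees of freedom, used here as a rough threshold) are our own illustrative choices. The SSR and SSE figures are the ones quoted in this section, and `BF` denotes the model the text writes as FB.

```python
def partial_f(sse_reduced, sse_full, df_full, df_diff=1):
    """Partial F: drop in SSE per dropped d.f., over the full model's mean square error."""
    return ((sse_reduced - sse_full) / df_diff) / (sse_full / df_full)


# SSR of each two-predictor model containing B, as quoted from Table 3.6.1.
ssr_with_b = {"AB": 0.667, "BC": 0.715, "BD": 0.686, "BE": 0.659, "BF": 0.676}
best_pair = max(ssr_with_b, key=ssr_with_b.get)  # candidate with the highest SSR

# Assumed cut-off: the 5% value the text quotes for 1 and 40 d.f.
F_CRIT = 4.08

# (step, SSE of reduced model, SSE of full model, error d.f. of full model)
steps = [
    ("add C to B", 0.342, 0.285, 43),
    ("add D to B and C", 0.285, 0.282, 42),
]

results = []
for step, sse_reduced, sse_full, df_full in steps:
    f_stat = partial_f(sse_reduced, sse_full, df_full)
    # Store the step, the rounded statistic, and whether the variable is retained.
    results.append((step, round(f_stat, 1), f_stat > F_CRIT))

print(best_pair)
for row in results:
    print(row)
```

With the rounded inputs from the text, the first step gives a statistic above the cut-off and the second gives one below it, so the sketch stops at the model with $B$ and $C$.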

Statistics Assignment Help | Experimental Design Assignment and Exam Help | Backward Elimination

Another approach is to commence with the full model of six predictor variables and to attempt to remove variables sequentially. In Table 3.6.1, the five-predictor model with the greatest SSR is BCDEF, with SSR $=0.749$ or SSE $=0.251$. To decide if $A$ should be removed, we compare this SSE with that of the full model (SSE $=0.247$) using the $F$ statistic.
$$F=(0.251-0.247) /(0.247 / 39)=0.004 /(0.247 / 39)=0.6$$

Clearly, the effect of $A$ is not significant and $A$ can be removed. We look to remove one of the remaining five variables by considering the SSR for each of the four-predictor models. These are:
$$\mathrm{BCDE}=0.719, \quad \mathrm{CDEF}=0.703, \quad \mathrm{BDEF}=0.712, \quad \mathrm{FBCD}=0.721, \quad \mathrm{BEFC}=0.741$$
We choose this last one with $D$ omitted and test whether this causes a significant reduction in SSR. The full model is now BCDEF and the reduced model is BCEF.
$$F=0.008 /(0.251 / 40)=1.3$$
This value is low compared with the $5 \%$ tabulated value for 1 and 40 degrees of freedom, which equals $4.08$, so we proceed to eliminate a further variable. The three-variable regression sums of squares are
$$\mathrm{CEF}=0.684, \quad \mathrm{FBC}=0.717, \quad \mathrm{BEF}=0.704, \quad \mathrm{EBC}=0.717$$
For either of the models with SSR $=0.717$,
$$F=0.024 /(0.259 / 41)=3.8$$
This is slightly below the critical value of $4.08$, so we proceed and compare these two models, FBC and EBC, with BC, the two-variable subset common to both.
$$F=0.002 /(0.283 / 42)=0.3$$
We are then reduced to the model BC, as in the forward selection process.
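The elimination sequence can be sketched in Python as follows. The assumptions are ours: `N_OBS = 46` is inferred from the 39 error degrees of freedom of the six-predictor model, SST is taken as 1 (correlation form) so that SSE $= 1 -$ SSR, and a single cut-off `F_CRIT = 4.08` is applied at every stage. Only the candidate models quoted in the text are listed, with variable letters sorted alphabetically.

```python
N_OBS = 46     # assumption: inferred from 39 error d.f. with six predictors
F_CRIT = 4.08  # 5% value the text quotes for 1 and 40 d.f., used as a rough cut-off

# Start from the full model; SSR = 1 - SSE = 1 - 0.247 in correlation form.
current_name, current_ssr = "ABCDEF", 0.753

# Candidate models considered at each stage, with the SSR quoted in the text.
stages = [
    {"BCDEF": 0.749},
    {"BCDE": 0.719, "CDEF": 0.703, "BDEF": 0.712, "BCDF": 0.721, "BCEF": 0.741},
    {"CEF": 0.684, "BCF": 0.717, "BEF": 0.704, "BCE": 0.717},
    {"BC": 0.715},
]

history = []
for candidates in stages:
    # Remove the variable whose omission leaves the highest SSR.
    name = max(candidates, key=candidates.get)
    df_error = N_OBS - len(current_name) - 1  # error d.f. of the current model
    f_stat = (current_ssr - candidates[name]) / ((1.0 - current_ssr) / df_error)
    history.append((current_name, name, round(f_stat, 1)))
    if f_stat > F_CRIT:
        break  # the loss in SSR is significant, so keep the current model
    current_name, current_ssr = name, candidates[name]

for row in history:
    print(row)
print("final model:", current_name)
```

Because BCF and BCE tie at SSR $= 0.717$, `max` simply takes the first of the two; either choice leads to the same comparison with BC.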
There are a number of points to notice about these sequential methods.

Statistics Assignment Help | Experimental Design Assignment and Exam Help | QUALITATIVE (DUMMY) VARIABLES

It is often useful to introduce variables into a model to enable certain specific effects to be revealed and tested. Usually these take the form of qualitative variables which show up the differences between subgroups in the data. We shall use an example to explore these ideas.

In Example 1.5.1, we listed the value of an Australian stamp (the 1963 twopenny sepia coloured) in the years 1972-1980. We could compare this with the listed value of another stamp, and for few obvious reasons, we have chosen the 1867 New Zealand fourpenny rose coloured full face queen. We shall use the same transformation as before, namely
$$y_{1}=\ln v_{1}, \quad y_{2}=\ln v_{2}$$
for the Australian and New Zealand stamp respectively. The data is given in Table 3.8.1. We could fit a separate model to each stamp, that is, for the Australian stamp
$$\boldsymbol{y}_{1}=\alpha_{1} \mathbf{1}+\alpha_{2} \mathbf{t}_{1}+\boldsymbol{\varepsilon}_{1}$$
and the New Zealand stamp
$$\boldsymbol{y}_{2}=\beta_{1} \mathbf{1}+\beta_{2} \mathbf{t}_{2}+\boldsymbol{\varepsilon}_{2}$$
If the distributions of the deviations can be assumed to be the same, it will be advisable to combine these models into a single model.
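One common way to combine two such lines, offered here as an assumption rather than as the source's own formulation, is to stack the observations and add a dummy variable $d$ ($0$ for the Australian stamp, $1$ for the New Zealand stamp) together with its product with time: $y=\gamma_{0}+\gamma_{1} t+\gamma_{2} d+\gamma_{3} d t+\varepsilon$. The $\gamma$ coefficients, the time coding (years since 1972) and the helper `design_row` are hypothetical, and the sketch assumes one listed value per stamp per year. It builds only the design matrix, not the data of Table 3.8.1.

```python
YEARS = range(1972, 1981)  # the nine years 1972-1980 mentioned in the text


def design_row(year, is_new_zealand, base_year=1972):
    """Return [1, t, d, d*t] for one observation of the stacked model."""
    t = year - base_year            # assumed time coding: years since 1972
    d = 1 if is_new_zealand else 0  # dummy: 0 = Australian, 1 = New Zealand
    return [1, t, d, d * t]


# Stack the Australian rows on top of the New Zealand rows.
X = [design_row(y, False) for y in YEARS] + [design_row(y, True) for y in YEARS]

for row in X:
    print(row)
```

Under this parametrization the Australian line has intercept $\gamma_{0}$ and slope $\gamma_{1}$, while the New Zealand line has intercept $\gamma_{0}+\gamma_{2}$ and slope $\gamma_{1}+\gamma_{3}$, so testing $\gamma_{2}=0$ or $\gamma_{3}=0$ by the reduced-model method compares the two stamps directly.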


