Dale Berger, Claremont Graduate University http://wise.cgu.edu
The purpose of this paper is to explain the logic and vocabulary of one-way analysis of variance (ANOVA). The null hypothesis tested by one-way ANOVA is that two or more population means are equal. The question is whether (H0) the population means are equal for all groups, so that the observed differences in sample means are due to random sampling variation, or (Ha) the observed differences between sample means are due to actual differences in the population means.
The logic used in ANOVA to compare means of multiple groups is similar to that used with the t-test to compare means of two independent groups. When one-way ANOVA …
Assumption 1 is crucial for any inferential statistic. As with the t-test, Assumptions 2 and 3 can be relaxed when large samples are used, and Assumption 3 can be relaxed when the sample sizes are roughly the same for each group even for small samples. (If there are extreme outliers or errors in the data, we need to deal with them first.) As a first step, we will review the t-test for two independent groups, to prepare for an extension to ANOVA.
Review of the t-test for independent groups
Let us start with a small example. Suppose we wish to compare two training programs in terms of performance scores for people who have completed the training course. The table below shows the scores for six randomly selected graduates from each of two training programs. These (artificially) small samples show somewhat lower scores from the first program than from the second program. But can this difference be attributed to chance in the sampling process, or is it compelling evidence of a real difference in the populations? The t-test for independent groups is designed to address just this question by testing the null hypothesis $H_0: \mu_1 = \mu_2$. We will conduct a standard t-test for two independent groups, but will develop the logic in a way that can be extended easily to more than two …
The logic of this approach extends directly to one-way analysis of variance with k groups. We can use our data to calculate two independent estimates of the population variance: one is the pooled variance of scores within groups, and the other is based on the observed variance between group means. These two estimates are expected to be equal if the population means are equal for all k groups ($H_0: \mu_1 = \mu_2 = \dots = \mu_k$), but the estimates are expected to differ if the population means are not all the same.
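To make the two estimates concrete, the following is a minimal Python sketch, not the paper's own computation. The scores are hypothetical (they are not the training-program data), the function name `variance_estimates` is ours, and the between-groups estimate uses the standard size-weighted formula, which is an assumption here because this section has not yet written it out.

```python
from statistics import mean

def variance_estimates(groups):
    """Return (within, between) estimates of the population variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Within groups: pool squared deviations of scores from their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    within = ss_within / (n_total - k)

    # Between groups: squared deviations of group means from the grand mean,
    # weighted by group size (standard formula, assumed here).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    between = ss_between / (k - 1)
    return within, between

# Hypothetical scores for k = 3 groups of six people each.
groups = [
    [4, 7, 10, 10, 13, 16],
    [10, 13, 16, 16, 19, 22],
    [7, 10, 13, 13, 16, 19],
]
within, between = variance_estimates(groups)
ratio = between / within
print(within, between, ratio)  # 18.0 54.0 3.0
```

With these made-up scores the between-groups estimate is three times the within-groups estimate. If the population means were equal, we would expect the two estimates to be about the same size, so a ratio near 1.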
Within-groups estimate. Our single best estimate of the population variance is the pooled within-groups variance, $s_y^2$ from Formula 2. In our example $s_y^2 = 18$, with df = 10. In ANOVA terminology, the numerator of Formula 2 is called the Sum of Squares Within Groups, or SSWG, and the denominator is called the degrees of freedom Within Groups, or dfWG. The estimate of the population variance from Formula 2, SSWG/dfWG, is called the Mean Square Within Groups, or MSWG. Formula 3 is an equivalent way to express and compute MSWG.
Within-groups estimate of $\sigma_y^2$ = [pic] [Formula 3]
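As an illustration of the vocabulary, the sketch below computes SSWG, dfWG, and MSWG in Python. The scores are hypothetical: they were constructed so that the result matches the values quoted above (MSWG = 18 with df = 10), and they are not the scores from the paper's table. The second computation, a df-weighted average of the group variances, is one common equivalent expression; we assume it only for illustration and do not claim it is the exact form of Formula 3. The function name `mean_square_within` is ours.

```python
from statistics import mean, variance

def mean_square_within(groups):
    """Return (SSWG, dfWG, MSWG, weighted), where weighted is the
    df-weighted average of the group sample variances."""
    # SSWG: squared deviations of each score from its own group mean.
    ss_wg = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    # dfWG: each group contributes n_j - 1 degrees of freedom.
    dfs = [len(g) - 1 for g in groups]
    df_wg = sum(dfs)
    ms_wg = ss_wg / df_wg

    # Equivalent route: average the group variances, weighting each by its df.
    weighted = sum(df * variance(g) for df, g in zip(dfs, groups)) / df_wg
    return ss_wg, df_wg, ms_wg, weighted

# Hypothetical scores for two groups of six (not the paper's data).
program_1 = [4, 7, 10, 10, 13, 16]
program_2 = [10, 13, 16, 16, 19, 22]
ss_wg, df_wg, ms_wg, weighted = mean_square_within([program_1, program_2])
print(ss_wg, df_wg, ms_wg, weighted)  # 180 10 18.0 18.0
```

Both routes give the same value because each group's sum of squares equals its degrees of freedom times its sample variance.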