The two-sample t test of the previous section was based on several conditions: independent samples, normality, and equal variances. When the conditions of normality and equal variances are not valid but the sample sizes are large, the
results using a t (or t') test are approximately correct. There is, however, an alternative test procedure that requires less stringent conditions. This procedure, called the Wilcoxon rank sum test, is discussed here.

The assumptions for this test are that we have independent random samples taken from two populations whose distributions are identical except that one distribution may be shifted to the right of the other distribution, as shown in Figure 6.7. The Wilcoxon rank sum test does not require that populations have normal distributions. Thus, we have removed one of the three conditions that were required of the t-based procedures. The other conditions, equal variances and independence of the random samples, are still required for the Wilcoxon rank sum test.

FIGURE 6.7 Skewed population distributions identical in shape but shifted (horizontal axis: y, value of random variable)

Because the two population distributions are assumed to be identical under the null hypothesis, independent random samples from the two populations should be similar if the null hypothesis is true. Because we are now allowing the population distributions to be nonnormal, the rank sum procedure must deal with the possibility of extreme observations in the data. One way to handle samples containing extreme values is to replace each data value with its rank (from lowest to highest) in the combined sample, that is, the sample consisting of the data from both populations. The smallest value in the combined sample is assigned the rank of 1 and the largest