Suppose we ask 1000 people their age. If this is a representative sample, there will be very few people aged 1 or 2, just as there will not be many 95-year-olds. The most common ages will be somewhere in the 30s or 40s. A list of the number of people of each age may look like this:
|Age |Number of people |
|0 |1 |
|1 |2 |
|2 |3 |
|3 |8 |
|.. |.. |
|.. |.. |
|30 |45 |
|31 |48 |
|.. |.. |
|.. |.. |
|60 |32 |
|61 |30 |
|.. |.. |
|.. |.. |
|80 |6 |
|81 |3 |
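As a minimal sketch of how such a list could be produced from the raw answers, the Python snippet below counts how often each age occurs. The function name tally_ages and the ten sample answers are invented for illustration; they are not the survey data.

```python
from collections import Counter

def tally_ages(ages):
    """Return (age, number of people) pairs, sorted by age."""
    counts = Counter(ages)
    return sorted(counts.items())

# Hypothetical answers, invented for illustration only
sample_ages = [31, 30, 31, 2, 60, 30, 31, 61, 60, 3]

# Print the tally in the same layout as the list above
print("|Age |Number of people |")
for age, count in tally_ages(sample_ages):
    print(f"|{age} |{count} |")
```

Sorting by age keeps the rows in the same order as the list, which is also the order needed for the horizontal axis of a diagram.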
Next, we can turn this list into a scatter diagram, with age on the horizontal axis and the number of people of each age on the vertical axis.
[pic]
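The step from list to diagram can be sketched in Python using only the rows that are visible in the list; the rows shown as ".." are left out rather than guessed. The helper name scatter_points is an assumption, and the text output is only a crude stand-in for a real plot: it is turned on its side, with age running down the page and the number of people running across.

```python
# Rows visible in the list above; the elided ".." rows are left out
table_rows = {0: 1, 1: 2, 2: 3, 3: 8, 30: 45, 31: 48,
              60: 32, 61: 30, 80: 6, 81: 3}

def scatter_points(rows):
    """Split an age -> count mapping into two lists ordered by age."""
    ages = sorted(rows)
    counts = [rows[a] for a in ages]
    return ages, counts

ages, counts = scatter_points(table_rows)

# One '*' per point: the further right it sits, the more people of that age
for age, count in zip(ages, counts):
    print(f"{age:>3} |{' ' * count}*")
```

Even with most rows missing, the points rise towards the 30s and fall away again towards the 80s.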
From a statistical point of view, a scatter diagram may have one of two shapes.
It may be shaped, or at least look approximately, like a 'bell curve', which looks like this:
A 'bell curve' is perfectly symmetrical with respect to a vertical line through its peak and is sometimes called a "Gauss curve" or a "normal curve".
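The symmetry can be checked numerically. The sketch below uses the standard formula for the height of a normal curve and compares points at equal distances to the left and right of the peak. The function name normal_pdf and the parameter values (mean 0, standard deviation 1) are chosen purely for illustration and are not taken from the age example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x, for mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

# Illustrative parameters only (the standard normal curve)
mu, sigma = 0.0, 1.0

# Compare the curve at the same distance d on either side of the peak
for d in (0.5, 1.0, 2.0):
    left = normal_pdf(mu - d, mu, sigma)
    right = normal_pdf(mu + d, mu, sigma)
    print(d, math.isclose(left, right))
```

The peak lies at x = mu, and the height depends only on the distance from mu, which is why the two sides mirror each other.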
The second shape a scatter diagram may have is anything but a normal curve, as in the next drawing:
We can do a lot of good statistics with the normal curve, but far less with any other curve.
Let us assume that we have recorded the 1000 ages and computed the mean and standard deviation of these ages. Assuming the mean age came out as 40 years and the standard deviation as