Topic 2
Measures of
Central Tendency
These slides are copyright © 2003 by Tavis Barr. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).
Measures of Central Tendency
This chapter looks at three different concepts of how we describe a “typical” element of a data set.
● Mean
● Median
● Mode
There is no one “best” concept for all cases; we will discuss the advantages and disadvantages of each.
Mean
● The mean is what is most commonly called the average.
● If a population is finite, of size N, we can write the population mean as

$$\frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}$$

EXAMPLE:
● There are three countries in North America (N = 3).
● Their land areas are:

Canada    9,093,507 km²
Mexico    1,923,040 km²
U.S.      9,161,923 km²
Total    20,178,470 km²

● Average land area: 20,178,470 / 3 ≈ 6,726,157 km²
Source: 2005 CIA World Factbook
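The land-area arithmetic can be checked with a short Python sketch. The function name `population_mean` and the dictionary `land_area_km2` are illustrative names chosen here, not part of the slides; the figures are the three values listed in the example.

```python
# Land areas in km^2, copied from the example above.
land_area_km2 = {
    "Canada": 9_093_507,
    "Mexico": 1_923_040,
    "U.S.": 9_161_923,
}


def population_mean(values):
    """Return the sum of all values divided by the population size N."""
    values = list(values)
    return sum(values) / len(values)


total = sum(land_area_km2.values())
mean_area = population_mean(land_area_km2.values())

print(total)             # 20178470
print(round(mean_area))  # 6726157
```

Because every member of the population is included, the result is the population mean itself rather than an estimate of it.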
Mean – sample mean
● For a sample of size n, we can write the sample mean as

$$\frac{\sum_{i=1}^{n} X_i}{n} = \frac{X_1 + X_2 + \cdots + X_n}{n}$$

Example:
● Ten people are asked how many hours of TV they watched last night.
● Their responses are 1, 2, 0.5, 0, 4, 0, 2, 1.5, 0, 3.
● Mean: (1 + 2 + 0.5 + 0 + 4 + 0 + 2 + 1.5 + 0 + 3) / 10 = 14 / 10 = 1.4
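As a quick check of the TV-hours example, here is a minimal Python sketch. The helper `sample_mean` is a hypothetical name introduced for illustration; `statistics.mean` from the Python standard library is used only to cross-check it.

```python
import statistics

# Hours of TV reported by the ten respondents in the example.
hours = [1, 2, 0.5, 0, 4, 0, 2, 1.5, 0, 3]


def sample_mean(xs):
    """Return the sum of the observations divided by the sample size n."""
    return sum(xs) / len(xs)


mean_hours = sample_mean(hours)
print(mean_hours)              # 1.4
print(statistics.mean(hours))  # 1.4
```

Note that the three zero responses add nothing to the sum but still count toward n = 10, which is why the divisor is 10 and not 7.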
Advantages of the sample mean
1. It takes all values in the sample into account.
2. It is unique: Each sample and population has only one mean.
3. The deviations of the values from the mean sum to zero, so the mean acts as a “balancing point.”
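The “balancing point” property can be illustrated with the TV-hours sample from the earlier example. This is a sketch rather than a proof; it uses `fractions.Fraction` from the Python standard library so that the arithmetic is exact and the sum of deviations comes out as exactly zero rather than a tiny floating-point remainder.

```python
from fractions import Fraction

# The ten TV-hours responses, stored as exact fractions.
responses = ("1", "2", "0.5", "0", "4", "0", "2", "1.5", "0", "3")
hours = [Fraction(x) for x in responses]

mean = sum(hours) / len(hours)

# Deviation of each observation from the mean.
deviations = [x - mean for x in hours]

print(mean)             # 7/5
print(sum(deviations))  # 0
```

Values below the mean contribute negative deviations and values above it contribute positive ones; the two groups cancel exactly.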
Disadvantages of the Mean
1. It only exists for quantitative data.
● What is the mean of good, fair, and poor?
● Of red, yellow, and blue?
2. It can be affected strongly by outliers.
● Example: In Whoville, there are 10 people who earn $10,000 a year and one person who earns $1,000,000.
● What is the mean? Is it a