McGraw-Hill Ryerson Mathematics of Data Management, pp. 202–211
1. Edwin compares the street address numbers of three of his neighbours with the quality of their front lawn, which he rates on a scale of 1 to 10. He observes a positive linear correlation and concludes that people with higher street address numbers have better lawns. In Edwin’s study: a) What are the independent and dependent variables?
The independent variable is the street number and the dependent variable is the rating of the lawn. b) What is wrong with the way that the dependent variable is measured?
It is based on Edwin’s view of lawn quality rather than an objective standard. c) Classify the type of cause and effect that is most likely to explain the correlation.
An accidental relationship is most likely. With only three data points in the study, a single data point makes a huge difference. d) What is wrong with the sampling technique? Explain how it could be improved.
Edwin chooses three neighbours. He needs to randomize his sampling both in terms of street numbers and streets. He also needs a much larger sample.
2. A radio talk show host invites listeners to respond to her stated opinion that our country’s immigration policies are weak. Near the end of her show, she tallies that 68% of callers agreed with her statement, and then concludes that over two thirds of Canadians feel that our immigration policies are weak. a) What type of bias is present in this study?
Non-response bias is present. b) Explain why the sampling technique may be responsible for introducing this bias.
The sample is voluntary, and the host is likely to be rude to callers who disagree, so mostly people who agree with the host will take part. c) Would you classify this bias as intentional or unintentional? Explain.
It is difficult to classify without knowing whether the host understands basic statistics, but I would guess that it is intentional.
Use the information in the following table to answer questions 3 to 5.
Consider the following correlation between shoe size and science test scores.
|Shoe Size |7 |9 |6 |8 |11 |
|Test Scores |84 |81 |66 |72 |63 |
3. a) Create a scatter plot and classify the linear correlation.
The correlation is weak and negative. b) Perform a linear regression and determine the equation of the line of best fit and the correlation coefficient. Is this an effective model? Explain.
y = -1.2973x + 83.838, r = -0.27
This is not an effective model because the correlation is very weak.
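The regression in question 3 b) can be checked with the least-squares formulas. The following is a minimal Python sketch, assuming the five pairs in the table are the whole data set; the function name linear_regression is chosen here for illustration and does not come from the textbook.

```python
from math import sqrt

# Data from the table: shoe size (x) and science test score (y).
shoe_sizes = [7, 9, 6, 8, 11]
test_scores = [84, 81, 66, 72, 63]


def linear_regression(xs, ys):
    """Return (slope, intercept, r) for the least-squares line y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross-products about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # r takes the sign of the slope
    return slope, intercept, r


slope, intercept, r = linear_regression(shoe_sizes, test_scores)
print(f"y = {slope:.4f}x + {intercept:.3f}, r = {r:.2f}, r^2 = {r * r:.3f}")
```

The sign of r always matches the sign of the slope, so a downward-sloping line of best fit goes with a negative correlation coefficient. Here r^2 is well under 0.1, which means the line explains very little of the variation in test scores.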
4. Consider the non-linear regression equation: y = -2.525x^4 + 84.25x^3 – 1039.4x^2 + 5620.3x – 11 164 a) What type of non-linear regression was performed to generate this equation?
Polynomial, degree 4. b) Perform this regression and determine the coefficient of determination. Plot the curve through the data points. Is this an effective model? Explain why or why not.
R^2 = 1
This is an ineffective model as there are only five points, so a 4th degree polynomial regression will result in a perfect fit every time, regardless of whether it would model a larger number of points successfully.
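The perfect fit in question 4 can be demonstrated directly: five points with distinct x-values determine exactly one polynomial of degree 4. The following is a minimal Python sketch that solves for that polynomial with exact fractions, assuming the five pairs in the table; the helper names fit_exact_polynomial and evaluate are chosen here for illustration, and this is plain Gauss-Jordan elimination rather than the regression routine of any particular calculator or spreadsheet.

```python
from fractions import Fraction

# Data from the table: shoe size (x) and science test score (y).
shoe_sizes = [7, 9, 6, 8, 11]
test_scores = [84, 81, 66, 72, 63]


def fit_exact_polynomial(xs, ys):
    """Coefficients (constant term first) of the degree n-1 polynomial through n points."""
    n = len(xs)
    # Augmented Vandermonde matrix: each row is [1, x, x^2, ..., x^(n-1) | y].
    rows = [[Fraction(x) ** k for k in range(n)] + [Fraction(y)]
            for x, y in zip(xs, ys)]
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [value / pivot_value for value in rows[col]]
        for i in range(n):
            if i != col and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[col])]
    return [rows[i][n] for i in range(n)]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * Fraction(x) ** k for k, c in enumerate(coeffs))


coeffs = fit_exact_polynomial(shoe_sizes, test_scores)

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
mean_y = Fraction(sum(test_scores), len(test_scores))
residual_ss = sum((y - evaluate(coeffs, x)) ** 2
                  for x, y in zip(shoe_sizes, test_scores))
total_ss = sum((y - mean_y) ** 2 for y in test_scores)
r_squared = 1 - residual_ss / total_ss

print("coefficients, highest power first:", [float(c) for c in reversed(coeffs)])
print("R^2 =", float(r_squared))
```

Replacing test_scores with any other five values still gives R^2 = 1, as long as the scores are not all equal (otherwise the total sum of squares is zero). That is why the perfect fit says nothing about whether shoe size is related to test scores.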
5. a) Describe how the study could be improved in order to develop a better model.
A larger sample would improve the study considerably. c) What type of relationship do you think exists between these two variables? Explain.
The variables are unlikely to be related. d) Explain the danger of blindly using curve-fitting techniques as mathematical models.
A polynomial of high enough degree can fit any data set well, so we must consider not only the r^2 value, but also whether the equation is likely to model the phenomenon of interest.
6. Often in professional sports, when a league expands (admits new teams), the existing teams collectively enjoy greater success at the expense of the new teams. Suppose that a certain professional league underwent a major expansion during 1994–1995. a) Explain how this expansion might act as a hidden variable when examining the progress of an existing team over the 1990s decade.
New teams have mostly new players, new coaches and little chemistry. As a result, they are likely to be beaten by experienced teams, so a league expansion will improve the statistics of existing teams even if their own skill level has not changed. b) Sketch what you think a time-series graph of team achievement might look like during this decade, if the skill level of the team: i) remains relatively constant ii) steadily improves iii) steadily declines
[Figure: sketches of the time-series graphs for cases i) to iii)]