STAT 392
Tutorial – Ratio and Regression Estimation
1. Regression Estimation (from Lohr, Ex 3.6.4)
Foresters want to estimate the average age of trees in a stand. Determining age is cumbersome because one needs to count the tree rings on a core taken from the tree. In general, though, the older the tree, the larger the diameter, and diameter is easy to measure. The foresters measure the diameter of all 1132 trees and find that the population mean is 26.2 cm. They then randomly select 20 trees for age measurement.

Tree, k    Diameter, x_k (cm)    Age, y_k
   1             30.5              125
   2             29.0              119
   3             20.1               83
   4             22.9               85
   5             26.7               99
   6             20.1              117
   7             18.5               69
   8             25.9              133
   9             29.7              154
  10             28.7              168
  11             14.5               61
  12             20.3               80
  13             26.2              114
  14             30.5              147
  15             23.4              122
  16             21.6              106
  17             17.8               82
  18             27.2               88
  19             23.6               97
  20             20.8               99
(a) Treating the trees as a simple random sample, estimate the mean age of trees in the stand, $\bar{Y}$, with a variance estimate, 95% confidence interval, and RSE. Comment on the quality of the estimate.
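
As a check on the hand calculation for part (a), the following is a minimal sketch of the simple random sample estimate using only the Python standard library. The variable names are illustrative, and two choices are assumptions rather than something the tutorial specifies: the finite population correction (1 - n/N) is applied, and the 95% interval uses the t multiplier for 19 degrees of freedom (2.093) instead of 1.96.

```python
import math

# Ages of the 20 sampled trees, copied from the data above.
ages = [125, 119, 83, 85, 99, 117, 69, 133, 154, 168,
        61, 80, 114, 147, 122, 106, 82, 88, 97, 99]

N = 1132          # number of trees in the stand
n = len(ages)     # sample size
T_CRIT = 2.093    # assumed multiplier: 0.975 quantile of t with n - 1 = 19 df

y_bar = sum(ages) / n                                  # sample mean
s2_y = sum((y - y_bar) ** 2 for y in ages) / (n - 1)   # sample variance
var_y_bar = (1 - n / N) * s2_y / n                     # variance of the mean, with fpc
se_y_bar = math.sqrt(var_y_bar)                        # standard error
ci = (y_bar - T_CRIT * se_y_bar, y_bar + T_CRIT * se_y_bar)
rse = se_y_bar / y_bar                                 # relative standard error

print(f"mean = {y_bar:.1f}, variance = {var_y_bar:.2f}, SE = {se_y_bar:.2f}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f}), RSE = {100 * rse:.1f}%")
```

If the course convention is a normal multiplier, replacing `T_CRIT` with 1.96 narrows the interval slightly without changing the other quantities.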
(b) Draw a scatterplot of these data (make sure the x and y axes both start at zero). Fit a regression line $y = \alpha + \beta x + \varepsilon$ to the data, and add it to the plot.
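
The fitting step in part (b) can be sketched with the ordinary least squares formulas, slope = S_xy / S_xx and intercept = (mean of y) - slope × (mean of x). This is only an illustration in plain Python with hypothetical variable names; the plotting itself is left out, and any statistics package would give the same coefficients.

```python
# Diameters (cm) and ages of the 20 sampled trees, copied from the data above.
diam = [30.5, 29.0, 20.1, 22.9, 26.7, 20.1, 18.5, 25.9, 29.7, 28.7,
        14.5, 20.3, 26.2, 30.5, 23.4, 21.6, 17.8, 27.2, 23.6, 20.8]
ages = [125, 119, 83, 85, 99, 117, 69, 133, 154, 168,
        61, 80, 114, 147, 122, 106, 82, 88, 97, 99]

n = len(diam)
x_bar = sum(diam) / n
y_bar = sum(ages) / n

# Corrected sums of squares and cross-products.
s_xx = sum((x - x_bar) ** 2 for x in diam)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(diam, ages))

beta = s_xy / s_xx              # least squares slope
alpha = y_bar - beta * x_bar    # least squares intercept

print(f"fitted line: y = {alpha:.2f} + {beta:.3f} x")
```

Drawing the line from x = 0 makes the intercept visible on the plot, which is useful when judging whether a line through the origin is plausible.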
(c) Determine whether ratio estimation using diameter as the auxiliary variable would be beneficial.
[You will need to compute the correlation coefficient of x and y, and their respective coefficients of variation.]
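
The quantities named in the hint can be computed as in the sketch below. The comparison at the end is an assumption about the intended criterion: a commonly quoted rule is that the ratio estimator is expected to beat the simple random sample mean when the correlation exceeds CV(x) / (2 CV(y)). Variable names are illustrative.

```python
import math

# Diameters (cm) and ages of the 20 sampled trees, copied from the data above.
diam = [30.5, 29.0, 20.1, 22.9, 26.7, 20.1, 18.5, 25.9, 29.7, 28.7,
        14.5, 20.3, 26.2, 30.5, 23.4, 21.6, 17.8, 27.2, 23.6, 20.8]
ages = [125, 119, 83, 85, 99, 117, 69, 133, 154, 168,
        61, 80, 114, 147, 122, 106, 82, 88, 97, 99]

n = len(diam)
x_bar = sum(diam) / n
y_bar = sum(ages) / n

s_xx = sum((x - x_bar) ** 2 for x in diam)
s_yy = sum((y - y_bar) ** 2 for y in ages)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(diam, ages))

r = s_xy / math.sqrt(s_xx * s_yy)            # sample correlation coefficient
cv_x = math.sqrt(s_xx / (n - 1)) / x_bar     # coefficient of variation of x
cv_y = math.sqrt(s_yy / (n - 1)) / y_bar     # coefficient of variation of y

# Assumed rule of thumb: ratio estimation helps when r > CV(x) / (2 CV(y)).
threshold = cv_x / (2 * cv_y)
ratio_helps = r > threshold

print(f"r = {r:.3f}, CV(x) = {cv_x:.3f}, CV(y) = {cv_y:.3f}")
print(f"threshold = {threshold:.3f}, ratio estimation beneficial: {ratio_helps}")
```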
(d) Make a ratio estimate of the mean age of trees in the stand, $\bar{Y}$, with a variance estimate, 95% confidence interval, and RSE. Comment on the quality of the estimate.
    i. Fit the zero-intercept regression line $y = Rx + \varepsilon$ to the data.
    ii. Add this line to your scatterplot.
    iii. Estimate $\bar{Y}$ with
         $$\bar{Y}_R = R\bar{X}$$
         where $\bar{X}$ is the population mean value of $x$.
    iv. Compute the residuals $e_k = y_k - \hat{y}_k = y_k - R x_k$.
    v. Compute the variance of the residuals, $s_e^2$.
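
The computational steps listed so far (fit R, form the ratio estimate, compute the residuals and their variance) can be sketched as follows. Two assumptions are made here and are not taken from the tutorial: R is computed as the classical ratio estimator (sample mean of y divided by sample mean of x), which matches a zero-intercept fit whose error variance is proportional to x, and the residual variance uses the divisor n - 1. An unweighted least squares fit through the origin would instead give the slope Σxy / Σx², which is close but not identical.

```python
# Diameters (cm) and ages of the 20 sampled trees, copied from the data above.
diam = [30.5, 29.0, 20.1, 22.9, 26.7, 20.1, 18.5, 25.9, 29.7, 28.7,
        14.5, 20.3, 26.2, 30.5, 23.4, 21.6, 17.8, 27.2, 23.6, 20.8]
ages = [125, 119, 83, 85, 99, 117, 69, 133, 154, 168,
        61, 80, 114, 147, 122, 106, 82, 88, 97, 99]

X_BAR_POP = 26.2   # population mean diameter (cm), known from the full census

n = len(diam)
x_bar = sum(diam) / n
y_bar = sum(ages) / n

# Step i (assumed form): slope of the line through the origin, R = y_bar / x_bar.
R = y_bar / x_bar

# Step iii: ratio estimate of the population mean age.
y_bar_R = R * X_BAR_POP

# Step iv: residuals about the fitted line y = R x.
resid = [y - R * x for x, y in zip(diam, ages)]

# Step v: variance of the residuals (assumed divisor n - 1).
# With R = y_bar / x_bar the residuals sum to zero, so no centring is needed.
s2_e = sum(e ** 2 for e in resid) / (n - 1)

print(f"R = {R:.4f}, ratio estimate = {y_bar_R:.1f}, s_e^2 = {s2_e:.1f}")
```

Because the sample mean diameter (23.9 cm) is below the population mean (26.2 cm), the ratio estimate is adjusted upward relative to the sample mean age.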