Time allowed: Three hours. Total marks: 60 marks.
There are FIVE questions in this examination; each starts on a fresh page. Answer ALL five questions; start each answer on a fresh page. All questions carry equal marks. The value of each sub-question is indicated in brackets. On the front of your answer book, write the number of each question you have attempted. Statistical tables and useful formulae are attached to this examination paper. Electronic calculators may be used. The examination paper may be retained by the candidate. Answers must be written in black or blue ink. Pencils may be used only for drawing, sketching or graphical work. Show the working steps in your answers.
Question 1
(a) Consider the following binomial distribution:
Answer the following based on the above distribution:
(i) What can you say about the skewness of this distribution? Explain. [1 mark]
(ii) What are the mean and variance for the number of successes? [2 marks]
(iii) The experiment that underlies this distribution is whether a project is profitable or not. Success is defined as a project being profitable. If undertaking a series of these projects is to be well approximated by a binomial random variable, what are the two important assumptions that need to be satisfied? [1 mark]
(b) Windows Vista is a very unstable operating system. You are running your business using a PC with Windows Vista and from experience you expect your computer to crash 3 times a day. If this process is modeled as a Poisson random variable then answer the following:
(i) What is the probability that your computer works smoothly through the day without crashing at all? [1 mark]
(ii) If your computer crashes more than five times in one day, you get a free software upgrade. What is the probability of that? [1 mark]
(iii) If, for every time your computer crashes, you lose $6 in business revenue, what are the mean and standard deviation of your loss per day? [2 marks]
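The quantities asked for in part (b) can be checked numerically. The following is a minimal sketch, assuming the crash count per day is Poisson with rate 3 as stated; the helper name `poisson_pmf` and the variable names are illustrative only and not part of the paper.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k events for a Poisson variable with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 3.0  # expected crashes per day (from the question)

# (i) no crash at all during the day
p_no_crash = poisson_pmf(0, lam)

# (ii) more than five crashes: complement of P(X <= 5)
p_more_than_five = 1.0 - sum(poisson_pmf(k, lam) for k in range(6))

# (iii) loss = 6 * X, so the mean scales by 6 and the standard deviation by 6
loss_per_crash = 6.0
mean_loss = loss_per_crash * lam
sd_loss = loss_per_crash * math.sqrt(lam)  # Var(X) = lam for a Poisson

print(round(p_no_crash, 4), round(p_more_than_five, 4), mean_loss, round(sd_loss, 2))
```

The standard deviation in (iii) relies on the Poisson variance being equal to its mean, so the loss has variance 36 × 3.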
(c)
(i) To estimate a population parameter $\theta$ we have constructed two estimators, $\hat{\theta}$ and $\tilde{\theta}$. Explain what it means to say that $\hat{\theta}$ is efficient relative to $\tilde{\theta}$. [1 mark]
(ii) Production of a gadget takes two tasks that have to be completed in succession. The time it takes to complete task one is distributed normally with mean 35 min and standard deviation 6 min. The time it takes to complete task two (which starts right after task one is completed) is independent of the time it takes for task one and is also distributed normally, with mean 22 min and standard deviation 3 min. What is the probability that the production of a gadget takes no more than one hour? [2 marks]
(iii) After production is completed, every gadget goes into quality assurance. Experience shows that about 10% of gadgets are rejected. If 360 gadgets enter quality assurance a day, what is the probability that at least 320 gadgets pass the quality test in one day? [1 mark]
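Parts (c)(ii) and (c)(iii) can be sketched with the Python standard library. This is an illustration under stated assumptions, not a model answer: it assumes the two task times are independent normals as given, and that each gadget passes quality assurance independently with probability 0.9. The continuity correction in the normal approximation is a choice made here, not something the paper prescribes.

```python
import math
from statistics import NormalDist

# (ii) The sum of independent normals is normal: means add, variances add.
total_time = NormalDist(mu=35 + 22, sigma=math.sqrt(6**2 + 3**2))
p_within_hour = total_time.cdf(60)

# (iii) Number of passes X ~ Binomial(360, 0.9); we want P(X >= 320).
n, p_pass = 360, 0.9

# Exact binomial tail, summed term by term.
p_exact = sum(
    math.comb(n, k) * p_pass**k * (1 - p_pass) ** (n - k)
    for k in range(320, n + 1)
)

# Normal approximation with a continuity correction (assumed here).
approx = NormalDist(mu=n * p_pass, sigma=math.sqrt(n * p_pass * (1 - p_pass)))
p_approx = 1 - approx.cdf(319.5)

print(round(p_within_hour, 4), round(p_exact, 4), round(p_approx, 4))
```

Comparing `p_exact` with `p_approx` gives a feel for how well the normal approximation works when n is large but p is far from 0.5.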
Question 2
(a) Explain briefly the importance of the Central Limit Theorem for statistical inference. [2 marks]
(b) The marketing department of a large store company is interested in finding out how much customers spend when they visit a city centre store. Accordingly, it randomly samples 26 customers who are leaving the store and asks them about their expenditures. The data reveals a sample mean of $125.40 and a sample standard deviation of $27.30. Stating carefully any assumptions you make, construct a 90% confidence interval for the mean value of expenditure by all customers. [3 marks]
(c) The store manager knows that a similar survey was conducted last year and that the mean expenditure then was $116. Assuming that consumer price inflation during the year was 3% and that $116 was previously the population mean expenditure, examine whether or not real consumption expenditure at the store has increased in the year. [2 marks]
(d) Explain what the Type I and Type II errors are in the context of the example of part (c). [1 mark]
(e) Without going into details, describe how you would conduct the hypothesis test of part (c) if the value $116 was also a sample mean derived from a sample, but this time from a sample of 37 clients. [2 marks]
(f) How, if at all, would your approach to answering the inference problems of parts (b), (c) and (e) change if each of the two samples of customers alluded to were 10 times as large? [2 marks]
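The arithmetic behind parts (b) and (c) can be sketched as follows. This is a hedged illustration: it assumes expenditures are approximately normal so that a t interval with 25 degrees of freedom applies, it takes the critical value 1.708 from standard t tables (the standard library has no t distribution), and it assumes a one-sided test at the 5% level for part (c), since the paper does not fix a significance level.

```python
import math

n, xbar, s = 26, 125.40, 27.30   # sample size, mean and standard deviation
t_crit = 1.708                    # upper 5% point of t with 25 df (from tables)

# (b) 90% confidence interval for the population mean
se = s / math.sqrt(n)
ci = (xbar - t_crit * se, xbar + t_crit * se)

# (c) last year's mean adjusted for 3% inflation, used as the null value
mu0 = 116 * 1.03
t_stat = (xbar - mu0) / se

# One-sided test of H0: mu = mu0 against H1: mu > mu0 at the assumed 5% level
reject_h0 = t_stat > t_crit

print([round(v, 2) for v in ci], round(t_stat, 3), reject_h0)
```

The same critical value serves both parts only because a two-sided 90% interval and a one-sided 5% test share the same tail area.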
Question 3
(a) A telephone survey was conducted by the Australian Chamber of Commerce and Industry (ACCI) to assess business confidence in the future performance of the economy. A total of 1711 responses were obtained from every state, a range of industries and companies of varying sizes. Comment briefly on the appropriateness of this method of sampling and suggest whether you think it may introduce biases. [2 marks]
(b) Of the respondents, 296 indicated that they thought the economy would be stronger next year than this. Estimate the sample proportion of firms who are optimistic about economic prospects and use this information to set up a 99% confidence interval for the population proportion of companies that think the economy will be stronger next year. [3 marks]
(c) Suppose that before the survey was undertaken, the ACCI wanted to ensure that the 99% confidence interval (CI) for the proportion of companies that think the economy will be stronger next year has a total width of not more than 0.07. How many firms should it have sought to interview? How much smaller would this number be if a confidence interval with a confidence coefficient of 0.8 (i.e. an 80% CI) were required, rather than 0.99? [3 marks]
(d) Now assume that a subsample of 50 of these firms is connected to the clothing and footwear sector and they were also asked about the size of typical orders for their product. The mean and standard deviation of this sample were $X and $275, respectively. If a test of the hypothesis H0: μ = 1400 against a 2-sided alternative yielded a P-value of 0.05, what was the value of X? [2 marks]
(e) Set up a 95% confidence interval for the population variance for the problem of part (d) and explain why it is not symmetrical about the sample value. [2 marks]
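Parts (b) and (c) come down to the large-sample normal interval for a proportion. The sketch below is one way to check the arithmetic. It assumes the normal approximation is adequate, and for part (c) it assumes the conservative planning value p = 0.5, because no estimate of the proportion is available before the survey. The function names `z_two_sided` and `required_n` are illustrative.

```python
import math
from statistics import NormalDist

def z_two_sided(conf):
    """Standard normal critical value for a two-sided interval at level conf."""
    return NormalDist().inv_cdf(0.5 + conf / 2)

# (b) sample proportion and 99% confidence interval
n, optimistic = 1711, 296
p_hat = optimistic / n
margin = z_two_sided(0.99) * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

# (c) smallest n giving a total width of at most `width`
def required_n(conf, width, p_guess=0.5):
    half_width = width / 2
    z = z_two_sided(conf)
    return math.ceil(z**2 * p_guess * (1 - p_guess) / half_width**2)

n_99 = required_n(0.99, 0.07)
n_80 = required_n(0.80, 0.07)

print(round(p_hat, 4), [round(v, 4) for v in ci], n_99, n_80, n_99 - n_80)
```

Using p = 0.5 maximises p(1 - p), so the resulting sample size is safe whatever the true proportion turns out to be.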
Question 4
(a) An econometric consultant has been supplied with n observations on a pair of variables (X, Y) and is interested in modelling the relationship between them. Explain what the covariance and also the correlation between the two variables measure and how they can be computed. [2 marks]
(b) Consider the simple linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $i = 1, \dots, n$. What assumptions are usually made about the error terms in this model and why are they needed? [2 marks]
(c) The following summary output (Table 1) is from a regression equation estimated using Microsoft Excel. The dependent variable (Y) is the List Price of MP3 players in Australia (measured in Australian dollars and ranging from $40 to $194) and the regressor variable (X) is the storage capacity of each (in gigabytes, GB). The model is the simple regression model of part (b). Data has been obtained on 12 music players from which to fit the model. The results (with two deliberate omissions) are given in Table 1.
Table 1. Excel Output for Linear Regression of List price on capacity
Regression Statistics
R Square                            0.768
Standard Error of the Regression    37.305
Observations                        12

             Coefficients   Standard Error   t Stat   P-value
Intercept    6.667          A                0.253    0.806
Capacity     55.467         9.632            5.759    B

(i) What are the values of A and B? [2 marks]
(ii) What is the interpretation of the two parameter estimates? Do they correspond with your prior expectations? [1 mark]
(iii) Construct a 95% confidence interval for β1. [1 mark]
(iv) Test the null hypothesis that β1 = 65 against the one-sided alternative that β1 < 65 and evaluate the P-value of the test. How is the P-value to be interpreted? [1 mark]
(v) How is the ‘R Square’ to be interpreted and what does the Standard Error of the Regression tell you? [2 marks]
(vi) Suppose that a new MP3 player becomes available and its capacity is 4 GB. Make a point prediction for the List Price (Y) by using the estimated linear regression model. Comment briefly on how safe you think it is to rely on this prediction. [1 mark]
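Several of the quantities in part (c) follow directly from the identity t Stat = Coefficient / Standard Error. The sketch below works through them under stated assumptions: the critical value 2.228 is the upper 2.5% point of the t distribution with 10 degrees of freedom (12 observations minus 2 parameters), taken from standard tables. The value of B and the P-value in (iv) require t-distribution tail areas and are not computed here.

```python
b0, b1 = 6.667, 55.467   # estimated intercept and slope from Table 1
se_b1 = 9.632            # standard error of the slope
t_b0 = 0.253             # reported t statistic for the intercept

# (i) A is the intercept's standard error: coefficient divided by its t statistic
A = b0 / t_b0

# (iii) 95% confidence interval for beta_1
t_crit = 2.228           # t table value, 10 degrees of freedom
ci_b1 = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

# (iv) test statistic for H0: beta_1 = 65 against H1: beta_1 < 65
t_h0 = (b1 - 65) / se_b1

# (vi) point prediction of list price at a capacity of 4 GB
pred_4gb = b0 + b1 * 4

print(round(A, 3), [round(v, 3) for v in ci_b1], round(t_h0, 3), round(pred_4gb, 3))
```

The predicted price can be compared with the observed price range stated in the question ($40 to $194) when judging how far the prediction extrapolates beyond the data.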
Question 5
(a) Briefly explain the difference between R² and the adjusted R² in a multiple regression model. [2 marks]

We are interested in knowing if holding an MBA degree increases CEO pay. Data on 50 different companies are gathered. The following is a section of that data:

Pay ($1000)   Years in Position   Change in Stock Price (%)   Change in Sales (%)   Has MBA?
1530          7                   48                          89                    Yes
1117          6                   35                          19                    Yes
602           3                   9                           24                    No
1170          6                   37                          8                     Yes
1086          9                   34                          28                    No
…             …                   …                           …                     …

(b) Apart from data on MBA degree, other explanatory variables on the performance of companies are also included. Let MBA = 1 (if holding an MBA), = 0 (otherwise). Then a linear regression model is run with Pay as Y and the rest of the variables as X’s. The result is reported in the following table:
Regression Statistics
R Square             0.748503
Adjusted R Square    0.726148
Standard Error       422.4027
Observations         50

              Coefficients   Standard Error   t Stat     P-value
Intercept     -1.59333       155.914          -0.01022   0.991892
MBA           190.5883       33.51356         5.6869     9.08E-07
Chg Sales     1.588378       2.674151         0.593975   0.555504
Chg Stock     1.01268        1.315096         0.770043   0.445298
Years Pos.    304.7163       139.8686         2.17859    0.03464
(i) Build a 95% confidence interval for the coefficient on Change in Sales. [2 marks]
(ii) We wish to test if holding an MBA increases CEO pay. Conduct a suitable hypothesis test at a 5% significance level. Give the critical region for your test and the conclusion you reach. [2 marks]
(iii) Does the change in stock prices have a significant effect on CEO pay? State your null hypothesis, test it using a 10% significance level, and state your conclusion. [2 marks]
(iv) You are looking at a company whose stock price has gone up by 11% and whose sales have gone down by 5%. The CEO does not hold an MBA but has been in this job for five years. What do you predict as the pay of this CEO? [1 mark]
(v) How well does the regression model fit the data? Explain briefly. [1 mark]
(vi) Think of another explanatory variable that was not included as X but might affect the estimates. Suggest why it is important and could affect the results (just name one). [2 marks]
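The interval in (i) and the prediction in (iv) are plain arithmetic on the regression output. The sketch below is illustrative only and rests on two assumptions: the coefficient rows are matched to variables in the order printed in the table (Intercept, MBA, Chg Sales, Chg Stock, Years Pos.), and the critical value 2.014 is the upper 2.5% point of the t distribution with 45 degrees of freedom (50 observations minus 5 parameters), taken from tables.

```python
# Coefficients and standard errors, keyed by the row labels as printed (assumed order)
coef = {"Intercept": -1.59333, "MBA": 190.5883, "Chg Sales": 1.588378,
        "Chg Stock": 1.01268, "Years Pos.": 304.7163}
se = {"Intercept": 155.914, "MBA": 33.51356, "Chg Sales": 2.674151,
      "Chg Stock": 1.315096, "Years Pos.": 139.8686}

# Each reported t Stat should equal coefficient / standard error
t_stats = {name: coef[name] / se[name] for name in coef}

# (i) 95% confidence interval for the Change in Sales coefficient
t_crit = 2.014  # t table value, 45 degrees of freedom
ci_sales = (coef["Chg Sales"] - t_crit * se["Chg Sales"],
            coef["Chg Sales"] + t_crit * se["Chg Sales"])

# (iv) predicted pay: no MBA, sales down 5%, stock up 11%, five years in position
x = {"Intercept": 1, "MBA": 0, "Chg Sales": -5, "Chg Stock": 11, "Years Pos.": 5}
pred_pay = sum(coef[name] * x[name] for name in coef)  # in $1000

print([round(v, 3) for v in ci_sales], round(pred_pay, 2))
```

Recomputing `t_stats` and comparing them with the printed t Stat column is a quick consistency check on how the flattened table has been read.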