Regression Project Report
OPRE 433
Tianao Zhang 12/5/2011
Introduction
According to the data I received, there are 6578 data values in total: the data set consists of 13 columns and 506 rows. All the explanatory variables are continuous, as is the dependent variable, and there are no categorical variables. My goal is to build a regression model to predict the average of Y, or a particular value of Y, for a given X. The analysis addresses the following questions:
1. Do the regression assumptions (constant variance, normality, independence, and correct functional form) hold for the model? I test this by performing residual analysis.
2. Is there any relationship among the explanatory variables? I run a multicollinearity test to check this condition.
3. What are the confidence interval for the average Y and the prediction interval for a particular Y value?
4. How useful is the model, and how strong is the relationship between X and Y? To check this, I consider several statistics:
i. Multiple coefficient of determination (R² and adjusted R²)
ii. Durbin-Watson test (DWT)
iii. F ratio
iv. VIF value
v. P-value
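The statistics listed above can be sketched in a few lines of Python to make their definitions concrete. This is only an illustration using the standard textbook formulas; it is not the JMP output used in this report, and the function names and arguments (`n` observations, `k` predictors, `r2_j` for the R² obtained by regressing one predictor on the others) are assumptions introduced for the example.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - sse / sst


def adjusted_r_squared(r2, n, k):
    """Penalize R^2 for the number of predictors k, given n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def f_ratio(r2, n, k):
    """Overall F statistic: (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e ** 2 for e in residuals)
    return num / den


def vif(r2_j):
    """Variance inflation factor for one predictor, from its auxiliary R^2."""
    return 1.0 / (1.0 - r2_j)
```

Under these definitions, a Durbin-Watson value near 2 is consistent with independent residuals, and the VIF cutoff of 10 corresponds to an auxiliary R² of 0.9 for that predictor.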
Method of analysis
1. Find the important variables
Use “Stepwise” to eliminate unimportant independent variables (Analyze > Fit Model > Stepwise). After running “Stepwise”, JMP shows that column 3 and column 7 should be deleted, so the remaining columns have a strong relationship with the dependent variable.
2. Check the VIF values
If any variable's VIF is greater than 10, there is multicollinearity in the model. Fortunately, no strong correlation exists among the independent variables, so I keep all of them in the model at this step.
3. Build the model with the selected variables
I get the model
y = 668.1274 - 0.108416*X1 + 0.0458433*X2 + 2.7188168*X4 - 17.37683*X5 + 3.8015829*X6 - 1.492708*X8 + 0.2996025*X9 - 0.011777*X10 - 0.946554*X11 + 0.0092905*X12 - 0.52255*X13
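To show how the fitted equation in step 3 produces a point prediction, here is a minimal Python sketch. The coefficients are copied from the model above; the names `INTERCEPT`, `COEFFICIENTS`, and `predict` are hypothetical and chosen only for this illustration. It returns a point estimate only. The confidence and prediction intervals also need the standard errors from the JMP output, which are not reproduced here.

```python
# Coefficients copied from the fitted equation in the report.
# X3 and X7 are absent because stepwise selection removed them.
INTERCEPT = 668.1274
COEFFICIENTS = {
    "X1": -0.108416,
    "X2": 0.0458433,
    "X4": 2.7188168,
    "X5": -17.37683,
    "X6": 3.8015829,
    "X8": -1.492708,
    "X9": 0.2996025,
    "X10": -0.011777,
    "X11": -0.946554,
    "X12": 0.0092905,
    "X13": -0.52255,
}


def predict(row):
    """Return the fitted y for one observation given as {"X1": value, ...}."""
    missing = set(COEFFICIENTS) - set(row)
    if missing:
        raise KeyError(f"missing predictors: {sorted(missing)}")
    # Extra keys in row (for example X3 or X7) are ignored.
    return INTERCEPT + sum(beta * row[name] for name, beta in COEFFICIENTS.items())
```

Calling `predict` with a dictionary holding one value per retained predictor gives the estimated Y for that observation. The same number serves as the center of both the confidence interval for the average Y and the prediction interval for a particular Y.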
4. Check for violations of the regression assumptions
According to the Durbin-Watson test, the assumption of independence is violated. So my