The Problem Statement:
• Dupree sells heating oil to residential customers
• Customers may run out of oil
• Dupree wants to guarantee that the customer’s oil tank will never run dry.
• Dupree pledges “50 free gallons” in case a tank runs dry
• To estimate customers’ oil usage, the home heating industry uses the concept of “degree days.”
A degree day is equal to 68 degrees Fahrenheit minus the average daily temperature. For example, (68 – 50) = 18 (if the result is negative, it is set to zero). Using degree days and the tank size, the oil industry can estimate when the customer is getting low on fuel and when to resupply the customer.
• The data gathered from customers is given in the DUPREE.XLS file:
The number of gallons of oil used (oil_usage) and the number of degree days since the last oil fill for 40 customers
The number of people residing in the homes of the 40 customers (more people means more hot water usage)
An assessment by staff of the home type classification (1-5), a composite index of the home size, age, exposure to wind, level of insulation, and furnace type. A low home_factor index implies lower oil consumption per degree day.
• Use the data in DUPREE.XLS to see whether a statistically reliable oil consumption model can be estimated from the data.
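The degree-day definition above can be sketched in a few lines of Python. This is only an illustration of the stated rule (68 minus the average daily temperature, floored at zero); the function names `degree_days` and `cumulative_degree_days` are hypothetical and are not part of the project materials.

```python
BASE_TEMP_F = 68.0  # base temperature given in the problem statement


def degree_days(avg_daily_temp_f, base=BASE_TEMP_F):
    """Degree days for a single day: base minus average temperature, never below zero."""
    return max(0.0, base - avg_daily_temp_f)


def cumulative_degree_days(avg_temps_f):
    """Total degree days over a sequence of daily average temperatures,
    e.g. every day since a customer's last oil fill."""
    return sum(degree_days(t) for t in avg_temps_f)


# The example from the problem statement: an average temperature of 50 F.
print(degree_days(50))  # 18.0
```

A day warmer than 68 degrees contributes zero, so only cold days add to the running total that is compared against the tank size.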
• Some Additional Project Guidelines:
About 10 to 15 quality pages (concise)
Using the materials from this course and beyond, propose the best model
Using automated variable selection procedures, find the best model
Compare the above two models and propose your best possible model
Provide a summary of different models used and justify the best model
Define the parameters of the proposed model
Demonstrate the use of the model through examples
Would you recommend any explanatory variables to be omitted?
Would you recommend any other explanatory variables that could be added to the model?
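
As a starting point for the modelling guidelines above, the following is a minimal sketch of one possible workflow: an ordinary least-squares fit, a greedy forward selection, and a prediction. It assumes NumPy is available and uses adjusted R-squared as the selection criterion, which is only one of several automated selection procedures. The helper names (`fit_ols`, `forward_select`, `predict`) and the column names `degree_days` and `num_people` are hypothetical; the actual column headers in DUPREE.XLS may differ, and the demo values are synthetic placeholders, not the project data.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept. Returns (coefficients, adjusted R^2)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    X = np.asarray(X, dtype=float).reshape(n, -1)
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    p = X.shape[1]                                # number of predictors
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta, adj_r2


def forward_select(columns, y):
    """Greedy forward selection: repeatedly add the predictor that most
    improves adjusted R^2, and stop when no candidate improves it.
    `columns` maps a predictor name to a 1-D array of values."""
    selected, best_adj = [], -np.inf
    remaining = list(columns)
    while remaining:
        scores = []
        for name in remaining:
            X = np.column_stack([columns[c] for c in selected + [name]])
            scores.append((fit_ols(X, y)[1], name))
        adj, name = max(scores)
        if adj <= best_adj:
            break
        selected.append(name)
        remaining.remove(name)
        best_adj = adj
    return selected, best_adj


def predict(beta, row):
    """Predicted response for one observation (values in the fitted column order)."""
    return float(beta[0] + np.dot(beta[1:], np.asarray(row, dtype=float)))


if __name__ == "__main__":
    # Synthetic placeholder values, NOT the DUPREE.XLS data; coefficients are arbitrary.
    rng = np.random.default_rng(0)
    cols = {
        "degree_days": rng.uniform(200, 1500, 40),
        "num_people": rng.integers(1, 7, 40).astype(float),
        "home_factor": rng.integers(1, 6, 40).astype(float),
    }
    oil_usage = (0.1 * cols["degree_days"] + 5 * cols["num_people"]
                 + 10 * cols["home_factor"] + rng.normal(0, 5, 40))
    chosen, adj = forward_select(cols, oil_usage)
    beta, _ = fit_ols(np.column_stack([cols[c] for c in chosen]), oil_usage)
    print(chosen, round(adj, 3), beta)
```

With the real data, the three predictor arrays and the response would be read from DUPREE.XLS instead of generated. The selected model can then be compared with a model chosen by judgment, for example one that includes an interaction between home_factor and degree days.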