Data Mining Bankruptcy Case

Tzu Han Hung (Vivian)
CASE 2

1. Estimated profit by random selection
Expected spending per catalog mailed = 0.053 * $103 = $5.46
Expected gross profit by random selection = ($5.46 - $2) * 180,000 = $622,800

2. a) We applied partitioning to the “All_data” sheet; the partition output is shown in “Data_Partition1”.
b) The logistic regression output can be seen in “LR_Output1”. The target variable is “purchase”. We selected every variable except sequence_number (a meaningless variable), source_w (one of the “source” variables, removed because it is redundant), and spending (not meaningful for the target, purchase probability).
We chose the subset with 7 coefficients, since it has a Cp value of 7.4 (close to 7) and a probability greater than 10%. We applied the regression model to the testing and validation datasets (output is in “LR_Output2”, “LR_Testscore2”, and “LR_ValidLiftChart2”). In the test score sheet, we can see the probability generated for each row of the test data. The regression model and scoring summary are shown below.
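The "Cp close to the number of coefficients" rule used above can be illustrated with a minimal Python sketch. It assumes the standard linear-regression definition of Mallows' Cp; how the spreadsheet tool computes Cp for a logistic model is not stated here. The function names and all input numbers are hypothetical, chosen only so that the 7-coefficient subset comes out at 7.4.

```python
def mallows_cp(sse_subset, mse_full, n_obs, n_coef):
    """Mallows' Cp = SSE_p / MSE_full - n + 2p, where p counts the intercept."""
    return sse_subset / mse_full - n_obs + 2 * n_coef


def closest_to_p(sse_by_size, mse_full, n_obs):
    """Return (n_coef, cp) for the subset whose Cp is nearest its coefficient count."""
    scored = [
        (p, mallows_cp(sse, mse_full, n_obs, p))
        for p, sse in sse_by_size.items()
    ]
    return min(scored, key=lambda item: abs(item[1] - item[0]))


# Hypothetical inputs for illustration only (not taken from the case data):
# residual sum of squares for three candidate subset sizes.
sse_by_size = {5: 230.0, 7: 193.4, 9: 192.0}
n_obs = 200      # hypothetical number of training rows
mse_full = 1.0   # hypothetical mean squared error of the full model

best_p, best_cp = closest_to_p(sse_by_size, mse_full, n_obs)
print(best_p, round(best_cp, 1))
```

A subset whose Cp sits near its own coefficient count is usually read as having little bias relative to the full model, which is the reasoning behind preferring the 7-coefficient subset.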

3. a) The purchasers-only data is in the “Purchasers_only” sheet.
b) The partition is shown in the “Data_Partition2” sheet.
c) The multiple linear regression output can be seen in “MLR_Output1”. The target variable is “spending”. We selected every variable except sequence_number (a meaningless variable), source_w (one of the “source” variables, removed because it is redundant), and purchase (all values are 1 here).
d) To select the best subset, the first criterion we considered was adjusted R-squared: we looked for the point where its value stops improving, which is at around 8 coefficients. Our second criterion was Cp. Because Cp does not approach the number of coefficients until the model has more than 20 coefficients, we chose the 8-coefficient model, which keeps the model simple and avoids over-fitting. We applied the regression model to the testing and validation datasets (output is
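The first criterion in 3d, finding where adjusted R-squared stops improving, can be sketched in Python as follows. This is only an illustration: the function names, the `min_gain` threshold, and the R-squared values are hypothetical and are not taken from “MLR_Output1”. Coefficient counts are assumed to include the intercept.

```python
def adjusted_r2(r2, n_obs, n_coef):
    """Adjusted R-squared; n_coef counts the intercept, so n_coef - 1 predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_coef)


def first_plateau(r2_by_size, n_obs, min_gain=0.005):
    """Smallest subset size after which adjusted R-squared gains less than min_gain."""
    sizes = sorted(r2_by_size)
    adj = {k: adjusted_r2(r2_by_size[k], n_obs, k) for k in sizes}
    for current, nxt in zip(sizes, sizes[1:]):
        if adj[nxt] - adj[current] < min_gain:
            return current
    return sizes[-1]


# Hypothetical R-squared values by number of coefficients (illustration only).
r2_by_size = {6: 0.40, 7: 0.44, 8: 0.47, 9: 0.471, 10: 0.472}
chosen = first_plateau(r2_by_size, n_obs=500)
print(chosen)
```

Because adjusted R-squared penalizes each extra coefficient, small gains in raw R-squared beyond the plateau can leave it flat or lower, which is why the smaller model is preferred.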
