Data Mining Bankruptcy Case

-------------------------------------------------
Tzu Han Hung (Vivian)
CASE 2

1. Estimated profit by random selection
Expected spending per catalog mailed = 0.053 * $103 = $5.46
Expected gross profit by random selection = ($5.46 - $2) * 180,000 = $622,800

2. a) We partitioned the “All_data” sheet; the partition output is shown in “Data_Partition1”.
b) The logistic regression output can be seen in “LR_Output1”. The target variable is “purchase”. We selected every variable except sequence_number (a meaningless variable), source_w (one of the “source” variables, dropped because it is redundant), and spending (not meaningful for the target variable, purchase probability).
We chose the subset with 7 coefficients, since it has a Cp value of 7.4 (close to 7) as well as a probability greater than 10%. We applied the regression model to the testing and validation datasets (output is in “LR_Output2”, “LR_Testscore2”, and “LR_ValidLiftChart2”). In the “LR_Testscore2” sheet, we can see the probability output generated for each row of the test data. The regression model and scoring summary are shown below.
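The Cp rule used above (prefer the subset whose Cp is closest to its number of coefficients) can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet tool's own procedure: the function name `pick_by_cp` and every candidate row except (7, 7.4) are hypothetical, and the additional probability check mentioned in the write-up is not modeled.

```python
def pick_by_cp(candidates):
    """Return the (n_coefficients, cp) pair whose Cp is closest to its coefficient count."""
    return min(candidates, key=lambda c: abs(c[1] - c[0]))


# Hypothetical (n_coefficients, Cp) rows for illustration only;
# just the (7, 7.4) row is taken from the write-up.
candidates = [(5, 12.9), (6, 9.8), (7, 7.4), (8, 9.1)]

best = pick_by_cp(candidates)
print(f"Chosen subset: {best[0]} coefficients, Cp = {best[1]}")
```

With these illustrative rows the 7-coefficient subset wins because its gap |7.4 - 7| = 0.4 is the smallest.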

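Returning to the random-selection baseline in part 1, the arithmetic can be checked with a short Python sketch. The parameter names are assumptions about what the figures represent (0.053 as the response rate, $103 as average spending per purchaser, $2 as the cost per catalog, 180,000 as the number of catalogs mailed); the write-up itself gives only the numbers.

```python
def expected_gross_profit(response_rate, avg_spend, cost_per_catalog, n_mailed):
    """Expected gross profit when catalogs are mailed to randomly chosen customers."""
    # The write-up rounds expected spending to cents before multiplying,
    # so the same rounding is applied here to reproduce its figure.
    spend_per_catalog = round(response_rate * avg_spend, 2)
    profit = (spend_per_catalog - cost_per_catalog) * n_mailed
    return spend_per_catalog, profit


spend, profit = expected_gross_profit(0.053, 103.0, 2.0, 180_000)
print(f"Expected spending per catalog: ${spend:.2f}")
print(f"Expected gross profit: ${profit:,.0f}")
```

Without the intermediate rounding, expected spending is $5.459 per catalog and the gross profit comes to about $622,620 rather than $622,800.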
3. a) The data for purchasers only is in the “Purchasers_only” sheet.
b) The partition is shown in the “Data_Partition2” sheet.
c) The multiple linear regression output can be seen in “MLR_Output1”. The target variable is “spending”. We selected every variable except sequence_number (a meaningless variable), source_w (one of the “source” variables, dropped because it is redundant), and purchase (all values are 1 here).
d) To select the best subset, the first criterion we consider is adjusted R-squared: we look for the point where its value stops improving, which is around 8 coefficients. Our second criterion is Cp. Since Cp does not approach the number of coefficients until there are more than 20 coefficients, we chose the 8-coefficient subset as our regression model, which keeps the model simple and avoids over-fitting. We applied the regression model to testing and validation dataset (output is
