Chapter 7 - K neighbours

7.1
a. How would this customer be classified?
A. This customer would be classified as not accepting the personal loan offer. The KNN_Output suggests overfitting: the training classification matrix shows no errors (Class 0 = 0%, Class 1 = 0%, Overall = 0%), while the validation classification matrix shows substantial error (Class 0 = 4.2%, Class 1 = 55.85%, Overall = 9.1%).
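
A 0% training error is what k-NN produces when k = 1, because each training record is then its own nearest neighbor (provided no two records share identical predictors with different classes). The sketch below illustrates this on made-up data; it does not use the bank data set, and the names `knn_predict`, `error_rate`, and `make_data` are hypothetical helpers written for this illustration only.

```python
import math
import random

def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k training records closest to `point`."""
    # Rank training records by Euclidean distance to the query point.
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], point))
    votes = [train_y[i] for i in order[:k]]
    # A tied vote (possible when k is even) goes to class 0 in this sketch.
    return 1 if 2 * sum(votes) > len(votes) else 0

def error_rate(train_X, train_y, X, y, k):
    """Fraction of records in (X, y) that are misclassified."""
    wrong = sum(knn_predict(train_X, train_y, x, k) != label
                for x, label in zip(X, y))
    return wrong / len(y)

# Synthetic data (an assumption for illustration): two predictors, noisy labels.
rng = random.Random(0)

def make_data(n):
    X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    y = [1 if a + b + rng.gauss(0, 1) > 1.5 else 0 for a, b in X]
    return X, y

train_X, train_y = make_data(200)
valid_X, valid_y = make_data(200)

# With k = 1, every training record is matched to itself.
train_err_k1 = error_rate(train_X, train_y, train_X, train_y, 1)
valid_err_k1 = error_rate(train_X, train_y, valid_X, valid_y, 1)
print("training error:", train_err_k1)
print("validation error:", valid_err_k1)
```

The gap between the two printed error rates is the same kind of discrepancy the answer above reads as overfitting.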

b. What is a choice of k that balances between overfitting and ignoring the predictor information?
A. A choice of k that balances between overfitting and ignoring the predictor information is k = 6. After testing various values of k, this value was chosen because it minimizes the validation error: according to the validation error log, k = 6 gives a training error of 7.4% and a validation error of 8.75%.
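
The search described above, trying several values of k and keeping the one with the lowest validation error, might be sketched as follows. The data are synthetic and the helper names are hypothetical, so the k selected here has no bearing on the k = 6 reported for the bank data.

```python
import math
import random

def knn_predict(train_X, train_y, point, k):
    """Majority vote among the k training records closest to `point`."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], point))
    votes = [train_y[i] for i in order[:k]]
    # A tied vote (possible when k is even) goes to class 0 in this sketch.
    return 1 if 2 * sum(votes) > len(votes) else 0

def error_rate(train_X, train_y, X, y, k):
    """Fraction of records in (X, y) that are misclassified."""
    wrong = sum(knn_predict(train_X, train_y, x, k) != label
                for x, label in zip(X, y))
    return wrong / len(y)

# Synthetic data (an assumption for illustration): two predictors, noisy labels.
rng = random.Random(0)

def make_data(n):
    X = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    y = [1 if a + b + rng.gauss(0, 1) > 1.5 else 0 for a, b in X]
    return X, y

train_X, train_y = make_data(200)
valid_X, valid_y = make_data(200)

# Error log: k -> (training error, validation error).
error_log = {}
for k in range(1, 16):
    error_log[k] = (
        error_rate(train_X, train_y, train_X, train_y, k),
        error_rate(train_X, train_y, valid_X, valid_y, k),
    )

# Keep the k with the lowest validation error (smallest such k if several tie).
best_k = min(error_log, key=lambda k: error_log[k][1])

for k, (train_err, valid_err) in error_log.items():
    print(f"k={k:2d}  training={train_err:.3f}  validation={valid_err:.3f}")
print("best k:", best_k)
```

Selecting on validation error rather than training error matters here, since training error alone would always favor k = 1.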

c. Show the classification matrix for the validation data that results from using the best k.

d. Classify the customer using the best k.
A. Using the best k, the customer would be classified as not accepting the personal loan.

e. Re-partition the data, this time into training, validation, and test sets (50%: 30%: 20%). Apply the k-NN method with the k chosen above. Compare the classification matrix of the test set with that of the training and validation sets. Comment on the differences and their reason.
A. The training, validation, and test matrices show a steady increase in percentage error. There does not appear to be overfitting, since the error discrepancies among the three matrices are small: the error differs by 5.69% between training and validation, and by 14.05% between validation and test. Based on the lift chart, the model still appears to make a difference, even though the test classification matrix shows an 82% error rate for the loan acceptance class.
9.3
i. Compare the tree generated by the CT with the one generated by the RT. Are they
