Final Submission BI Assignment

Submitted By:
Poulomi Pal 2012PGP078
Sukshit Kapur 2012PGP097
Mohit Dhami 2012PGP070
Ujjwal Shankar 2012PGP103
Vineet Jain 2012PGP061

Assumptions for the Assignment:
We have merged the fatalities and non-injuries in MAX_SEV_IR into a single category, 0, because our class of interest is injury.
We included every predictor when running the different models, except for the tree. There we ran a random forest first and then ran the tree on the predictors that the forest ranked as most important.
Our class of interest is injury, so we have assumed that a false negative is substantially more expensive than a false positive.
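
To make the first and third assumptions concrete, here is a minimal sketch of the recoding and of a cost-weighted error measure. It assumes that MAX_SEV_IR is coded 0 = no injury, 1 = injury, 2 = fatality, and it uses illustrative cost weights of 5 for a false negative and 1 for a false positive. Neither the coding nor the weights are stated in this report, and the function names are hypothetical.

```python
def recode_target(max_sev_ir):
    """Map a MAX_SEV_IR value to a binary target: 1 = injury, 0 = everything else.

    Assumed coding (not stated in the report): 0 = no injury, 1 = injury, 2 = fatality.
    """
    return 1 if max_sev_ir == 1 else 0


def error_matrix(actual, predicted):
    """Count true/false positives and negatives for binary 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 0 and p == 0:
            counts["tn"] += 1
        else:
            counts["fn"] += 1  # actual injury predicted as non-injury
    return counts


def average_cost(actual, predicted, fn_cost=5.0, fp_cost=1.0):
    """Average misclassification cost per record; the weights are illustrative only."""
    m = error_matrix(actual, predicted)
    return (fn_cost * m["fn"] + fp_cost * m["fp"]) / len(actual)


# Toy data, not taken from the assignment's data set.
raw = [0, 1, 2, 1]
actual = [recode_target(v) for v in raw]
predicted = [1, 0, 0, 1]
cost = average_cost(actual, predicted)
```

With weights like these, a model that misses injuries is penalised more heavily than one that raises false alarms, which is the preference the third assumption expresses.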
Output for Forest:
Here we use the mean decrease in accuracy to rank the predictors used for classifying the data. The larger the mean decrease in accuracy, the more important the predictor is for classification. Based on the output, we select the important predictors to be used for running the tree: the 15 predictors with the highest mean decrease in accuracy.
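
The idea behind this measure can be sketched as follows: shuffle one predictor at a time and record how much the classification accuracy drops. This is a simplified illustration under stated assumptions, not the exact computation behind the forest output. A random forest typically computes the measure per tree on out-of-bag records, whereas this sketch uses one fixed classifier and one data set. All function names and the toy data are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def mean_decrease_accuracy(predict, rows, labels, n_repeats=20, seed=0):
    """Average drop in accuracy when each predictor column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j permuted.
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def top_k(names, importances, k):
    """Names of the k predictors with the largest mean decrease in accuracy."""
    ranked = sorted(zip(names, importances), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data: the label equals the first column; the second column is irrelevant.
rows = [[i % 2, (i // 2) % 2] for i in range(40)]
labels = [row[0] for row in rows]
predict = lambda row: row[0]  # stand-in for a fitted classifier

imp = mean_decrease_accuracy(predict, rows, labels)
selected = top_k(["FEATURE_A", "FEATURE_B"], imp, 1)
```

Shuffling a predictor the classifier relies on lowers the accuracy, while shuffling an irrelevant one leaves it unchanged, so ranking by the drop and keeping the top k mirrors the selection of the first 15 predictors described above.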

We now look at the variable importance table to decide on the important predictors.

Using the data from the above table, we find that the following predictors are of interest:
INJURY_CRASH FATALITIES NO_INJ_I PRPTYDMG_CRASH PED_ACC_R SPD_LIM VEH_INVL
REL_RWY_R STRATUM_R MANCOL_I_R RELJCT_I_R TRAF_CON_R WEATHER_R SUR_COUND
NON_INVL
The output of the forest is attached below in XML format.

We now run the tree using the reduced set of predictors above.
Running the tree on the reduced data set obtained from the forest gives the following output:

Checking the rules of the tree gives the following output:

The tree is drawn below:

Looking at the tree, we can infer that the two most important predictors for classifying the data are INJURY_CRASH and NO_INJ_I.
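
A classification tree places the predictors that separate the classes best at its top splits. The sketch below shows one common way to score a candidate split, the reduction in Gini impurity. The report does not state which splitting criterion was used, so Gini is an assumption here, and the function names and toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(values, labels, threshold):
    """Impurity reduction from splitting on values <= threshold versus values > threshold."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted


# Toy data: one predictor separates the classes perfectly, the other not at all.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]

gain_informative = gini_gain(informative, labels, 0.5)
gain_uninformative = gini_gain(uninformative, labels, 0.5)
```

The predictor with the larger gain would be chosen for the split, which is why predictors appearing near the root of the tree are read as the most important ones.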
Running the neural net gives the following output:

The below gives the error matrix of the
