Business Intelligence and Data Mining - Decision Trees

INDIAN INSTITUTE OF MANAGEMENT, INDORE
Post Graduate Programme – Term IV – AY 2012-13
Business Intelligence And Data Mining
Group Assignment on NGO Donations Maximization

Abstract

The problem is to devise a strategy that maximizes the profit from a direct marketing campaign aimed at a selected group of customers while minimizing costs. The exercise requires the use of Business Intelligence tools and techniques to build a model that is trained and tested on historical data from last year’s donation-raising campaign. The model should predict the profitability of a prospective donor, allowing a more targeted campaign at lower cost. The difficulty lies in the extremely imbalanced data and in the inverse correlation between the probability of a response and the dollar amount it generates. The data set and problem come from the KDD-CUP-98 challenge. The solution would apply to any direct marketing campaign for which historical data is available.

Introduction

The KDD-CUP-98 challenge asks for a model, trained and tested on historical data, that predicts which potential donors to contact so as to maximise profit. The model should produce a mailing list that targets only valuable customers. Existing models typically predict future response behaviour only. The historical database records past mailing campaigns, the customers’ responses and the dollar amounts collected. The model should identify the current customers who are likely to respond and maximize net profit (donation amount minus mailing cost) over the contacted customers.

The records come from the 1997 Paralyzed Veterans of America fundraising mailing campaign, and only 5% of them are responders. A classifier that labels every customer as a non-responder therefore reaches 95% accuracy while identifying no donors, so accuracy is a poor measure here. One approach ranks customers by their estimated probability of responding and selects the top portion of the list: if the top 5% of the list contains 30% of the responders, the lift is 6. The drawback is that this ranking does not use the donation amount. There is an inverse correlation between the probability of donating and the dollar amount, as donors who give larger amounts are more cautious, so probability-based ranking tends to rank valuable customers down. Another method adapts accuracy-based learning to cost-sensitive learning and tries to minimize cost, but because it builds the initial list from the probability of response and considers profitability only afterwards, it tends to ignore valuable customers, who usually donate infrequently.

A tweaked use of association rules leads to better results than the methods above. It identifies subsets of attributes that are correlated with the “respond” class and then uses a small subset of the generated association rules to identify potential customers in the current campaign.
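The arithmetic behind the accuracy, lift and net profit figures above can be checked with a short Python sketch. The function names, the synthetic list of 1,000 customers and the donation amounts are assumptions made for illustration only; they are not taken from the KDD-CUP-98 data.

```python
def majority_accuracy(responded):
    """Accuracy of a classifier that predicts 'not respond' for everyone."""
    return sum(1 for r in responded if r == 0) / len(responded)


def lift_at(scores, responded, fraction):
    """Lift of the top `fraction` of customers when ranked by score.

    Lift = (response rate in the selected top portion) / (overall response rate).
    """
    ranked = sorted(zip(scores, responded), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_hits = sum(r for _, r in ranked[:k])
    total_hits = sum(responded)
    return (top_hits * len(ranked)) / (k * total_hits)


def net_profit(donations, mailing_cost):
    """Sum of (donation - mailing cost) over every contacted customer."""
    return sum(d - mailing_cost for d in donations)


# Hypothetical list: 1,000 customers, 50 responders (5%).
# The 50 highest-scored customers contain 15 responders (30% of all responders).
responded = [1] * 15 + [0] * 35 + [1] * 35 + [0] * 915
scores = [1000 - i for i in range(1000)]  # already in ranked order

print(majority_accuracy(responded))      # share of non-responders
print(lift_at(scores, responded, 0.05))  # (15/50) / (50/1000)

# Hypothetical donations from three contacted customers, with an assumed
# mailing cost of 0.68 per letter.
print(net_profit([10.0, 0.0, 0.0], 0.68))
```

With these made-up numbers the trivial classifier scores 95% accuracy and the top 5% of the ranking has a lift of 6, yet neither figure says anything about how much money the selected customers donate, which is the point of ranking by net profit instead.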
The solution tries to increase customer value by selecting association rules that increase profitability over the current customers. Negative association rules may also indicate, given some attributes, the chance of not donating. On their own, association rules do not say how to maximize an objective function, especially when there is an inverse correlation.

The dataset has 191,779 records of customers contacted in the 1997 mailing campaign. Each record has 479 non-target variables and two target variables: respond / not_respond, and the actual donation in dollars. 5% of the records are respond records, and the dataset is split into 50% for learning and 50% for validation. Customers are to be evaluated and predicted against a mailing cost of $0.68.

The inverse correlation can take two forms. It can arise across offers to the same customer, which can be reduced by avoiding multiple mailings within a time period. It can also arise across different customers, meaning many small contributors and few big ones. The second type has to be addressed. One option is a two-step procedure: obtain probability estimates from decision trees, then re-rank customers using customer value. However, this still ignores the value in the first step.

The other problem is high dimensionality: 481 variables and a small target population make it difficult to identify features of the respond class. The one-attribute-at-a-time “gain criterion” does not search for correlated variables. It is good for maximising class probability, but not when the non-maximum class probability is also used to rank customers. Focussed association rules instead capture features that are typical of the respond class and not of the not_respond class, that is, subsets of variables that occur in the respond class but infrequently in the not_respond class. This prunes the not_respond data, which addresses the scarcity of data in the target class, and also removes variables that are frequent in the not_respond class.
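The idea of keeping only attribute combinations that are common among responders and rare among non-responders can be illustrated with a minimal Python sketch. It is not the rule-mining algorithm used in the KDD-CUP-98 literature: the function name, the two support thresholds, the limit to itemsets of size one or two, and the toy records are all assumptions chosen to keep the example short.

```python
from itertools import combinations


def focused_itemsets(records, labels, min_resp_support, max_nonresp_support, max_size=2):
    """Return itemsets that are frequent among responders and rare among non-responders.

    records: list of sets of "attribute=value" strings, one set per customer
    labels:  list of 1 (respond) or 0 (not_respond), aligned with records
    """
    responders = [rec for rec, y in zip(records, labels) if y == 1]
    others = [rec for rec, y in zip(records, labels) if y == 0]

    def support(itemset, group):
        # Fraction of customers in `group` whose record contains every item.
        if not group:
            return 0.0
        return sum(1 for rec in group if itemset <= rec) / len(group)

    # Candidates are generated from responder records only, so the large
    # not_respond class never drives the search.
    candidates = set()
    for rec in responders:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(rec), size):
                candidates.add(frozenset(combo))

    kept = {}
    for itemset in candidates:
        s_resp = support(itemset, responders)
        s_other = support(itemset, others)
        if s_resp >= min_resp_support and s_other <= max_nonresp_support:
            kept[itemset] = (s_resp, s_other)
    return kept


# Toy data (hypothetical attributes): two responders, four non-responders.
records = [
    {"wealth=high", "recency=old"},
    {"wealth=high", "recency=old"},
    {"wealth=high", "recency=new"},
    {"wealth=low", "recency=old"},
    {"wealth=low", "recency=new"},
    {"wealth=low", "recency=new"},
]
labels = [1, 1, 0, 0, 0, 0]

for itemset, (s_resp, s_other) in focused_itemsets(records, labels, 0.5, 0.1).items():
    print(sorted(itemset), s_resp, s_other)
```

In the toy data neither attribute passes the filter on its own, because each also appears in a quarter of the non-responders. Only the pair of attributes survives, which is the kind of correlated feature that a one-attribute-at-a-time criterion would not look for.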
The focussed association rules can then be converted into a model that predicts the donation amount for a customer: customers are covered using these rules, over-fitting rules are pruned, and a donation amount is estimated for each rule. The assumption is that current customers follow the same class and donation distribution as the historical records. The method has three stages. Rule Generation finds a set of good rules that capture features of responders. Model Building combines the rules into a prediction model for the donation amount. Model Pruning prunes rules that do not generalize to the entire population.

Our Approach
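As a rough illustration of the model building and model pruning stages described above, the following Python sketch attaches a donation estimate to each rule and then discards rules that lose money on held-out customers. It is a simplified stand-in, not the covering and pruning procedure of the original method: estimating a rule's donation as the mean over the training customers it covers, pruning on a holdout set, and all function names and data are assumptions.

```python
def build_model(rules, train):
    """Attach to each rule the mean donation of the training customers it covers.

    rules: list of frozensets of "attribute=value" strings
    train: list of (items, donation) pairs, where items is a set of strings
    """
    model = []
    for rule in rules:
        covered = [amount for items, amount in train if rule <= items]
        if covered:
            model.append((rule, sum(covered) / len(covered)))
    return model


def prune_model(model, holdout, mailing_cost):
    """Keep only rules whose covered holdout customers yield a positive net profit."""
    kept = []
    for rule, estimate in model:
        profit = sum(amount - mailing_cost for items, amount in holdout if rule <= items)
        if profit > 0:
            kept.append((rule, estimate))
    return kept


def predict_donation(model, items):
    """Predict with the highest estimate among matching rules; 0.0 if none match."""
    matches = [estimate for rule, estimate in model if rule <= items]
    return max(matches, default=0.0)


# Hypothetical rules and customers.
rules = [frozenset({"a"}), frozenset({"b"})]
train = [({"a"}, 20.0), ({"a", "c"}, 10.0), ({"b"}, 0.0), ({"b"}, 5.0)]
holdout = [({"a"}, 12.0), ({"b"}, 0.0), ({"b"}, 0.0)]

model = prune_model(build_model(rules, train), holdout, mailing_cost=0.68)
print(predict_donation(model, {"a", "b"}))
print(predict_donation(model, {"b"}))
```

Here the rule on "b" looks mildly profitable on the training data but loses money on the holdout customers, so it is pruned. A customer would then be mailed only if the predicted donation exceeds the mailing cost.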
