
Decision Tree Induction & Clustering Techniques in SAS Enterprise Miner, SPSS Clementine, and IBM Intelligent Miner – A Comparative Analysis

International Journal of Management & Information Systems – Third Quarter 2010

Volume 14, Number 3

Abdullah M. Al Ghoson, Virginia Commonwealth University, USA

ABSTRACT

Decision tree induction and clustering are two of the most prevalent data mining techniques, used separately or together in many business applications. Most commercial data mining software tools provide these two techniques, but few of them satisfy business needs. Many criteria and factors affect the choice of the most appropriate software for a particular organization. This paper provides a comparative analysis of three popular data mining software tools, SAS® Enterprise Miner, SPSS Clementine, and IBM DB2® Intelligent Miner, based on four main criteria: performance, functionality, usability, and auxiliary task support.

Keywords: data mining, classification, decision tree, clustering, software evaluation, SAS Enterprise Miner, SPSS Clementine, IBM Intelligent Miner, comparative analysis, evaluation criteria

1. INTRODUCTION

Businesses face challenges such as growth, regulations, globalization, mergers and acquisitions, competition, and economic changes, which require fast and good decisions rather than guesswork. Making good decisions requires accurate and clear analysis, such as prediction, estimation, classification, or segmentation, using data mining techniques. Decision tree induction and clustering are two of the most important data mining techniques for finding interesting patterns. There are many commercial data mining software packages on the market, and most of them provide decision tree induction and clustering techniques. Commercial data mining software is expensive, so choosing a package is a crucial and difficult decision. Therefore, this paper's objective is to help…



