Decision

Lab 1: Decision Trees and Decision Rules

Evgueni N. Smirnov smirnov@cs.unimaas.nl August 21, 2010

1. Introduction

Given a data-mining problem, you need to have data that represent the problem, models that are suitable for the data, and of course a data-mining environment that contains the algorithms capable of learning these models. In this lab you will study two well-known classification problems. You will try to find classification models for these problems using decision trees and decision rules. The algorithms to learn these models are given in Weka, a data-mining environment that accompanies our course. You will study the Explorer part of Weka to learn how to call decision-tree and decision-rule algorithms, how to evaluate the accuracy of the learned models, and how to use reduced error pruning.

2. Concept-Learning Problems

In this lab you are expected to build classification models for two classification problems:
• Labor-negotiation problem;
• Soybean classification problem.

The data files for both problems are provided at:

http://www.unimaas.nl/datamining/UCI/datasets-UCI.zip

3. Environment

As stated above, you will use Weka to build the desired classification models. Weka is a data-mining environment that contains a collection of machine-learning algorithms for solving real-world data-mining problems. The algorithms can either be applied directly or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well suited for developing new machine-learning schemes. Weka is open-source software issued under the GNU General Public License.

4. Algorithms

To build the classifiers you will use four learning algorithms provided in Weka:
1. ZeroR is a majority/average predictor. It assigns to each instance the classification of the
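To make the idea of a majority predictor concrete, here is a minimal Python sketch of the nominal-class case. It is not Weka's implementation: the function names (zero_r_fit, zero_r_predict, accuracy) and the toy labels and instances are hypothetical and chosen only for illustration. The sketch ignores every attribute and always predicts the most frequent class seen in training, which is why such a predictor is commonly used as a baseline when judging the accuracy of other models.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent class label among the training labels."""
    if not labels:
        raise ValueError("need at least one training label")
    return Counter(labels).most_common(1)[0][0]


def zero_r_predict(majority_label, instances):
    """Predict the stored majority label for every instance, ignoring attributes."""
    return [majority_label for _ in instances]


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical toy data (not taken from the lab's data files).
train_labels = ["good", "good", "bad", "good", "bad"]
test_instances = [{"wage": 2.0}, {"wage": 4.5}, {"wage": 3.0}]
test_labels = ["good", "bad", "good"]

majority = zero_r_fit(train_labels)
predictions = zero_r_predict(majority, test_instances)
baseline = accuracy(predictions, test_labels)
```

On this toy data the majority class is "good", so all three test instances are labelled "good" and two of the three predictions are correct. A learned tree or rule set is only interesting if it clearly beats this kind of baseline.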
