CLUSTER ANALYSIS: ALGORITHMS AND ANALYSIS USING SAS


BY: AHMED ALDAHHAN
SUPERVISED BY: LECTURER JING XU
BIRKBECK UNIVERSITY OF LONDON
2013/2014

ABSTRACT
This paper provides an introduction to cluster analysis: it gives general background, explains the concept of cluster analysis, and describes how clustering algorithms work. The basic idea and use of each clustering method are described together with its graphical features. Different clustering techniques are also explained with examples. The two main clustering techniques (hierarchical and K-means partitioning) are illustrated using a sample data set, the Iris flower data set (1936), and the two methods are compared on data suitability and model performance.

TABLE OF CONTENTS
CHAPTER 1
1.0 Introduction
1.1 Understanding Cluster Analysis

CHAPTER 2
2.0 Definitions
2.1 The Data Matrix
2.2 The Proximity Matrix
2.3 Similarity and Dissimilarity Matrices
2.4 Different Types of Clusters
2.4.1 Well-Separated Cluster Definition
2.4.2 Centre-Based Cluster Definition
2.4.3 Contiguity-Based Cluster Definition
2.4.4 Density-Based Cluster Definition
2.4.5 Shared-Property (Conceptual Clusters)
2.5 Distance Matrix
2.6 Hierarchical Clustering
2.6.1 Agglomerative Hierarchical Clustering
2.6.2 Divisive Hierarchical Clustering

