CLUSTER ANALYSIS: ALGORITHMS AND ANALYSIS USING SAS


BY: AHMED ALDAHHAN
SUPERVISED BY: LECTURER JING XU
BIRKBECK UNIVERSITY OF LONDON
2013/2014

ABSTRACT
This paper provides an introduction to cluster analysis: it gives the general background, explains the concept, and describes how the clustering algorithms work. The basic idea and use of each clustering method are described along with its graphical features. Different clustering techniques are also explained with examples. The two main clustering techniques (hierarchical and K-means partitioning) are illustrated using a sample data set, the Iris flower data set (1936), and the two methods are compared on data suitability and model performance.

TABLE OF CONTENTS
CHAPTER 1
1.0 Introduction
1.1 Understanding Cluster Analysis

CHAPTER 2
2.0 Definitions
2.1 The Data Matrix
2.2 The Proximity Matrix
2.3 Similarity and Dissimilarity Matrices
2.4 Different Types of Clusters
    2.4.1 Well-Separated Cluster Definition
    2.4.2 Centre-Based Cluster Definition
    2.4.3 Contiguity-Based Cluster Definition
    2.4.4 Density-Based Cluster Definition
    2.4.5 Shared-Property (Conceptual Clusters)
2.5 Distance Matrix
2.6 Hierarchical Clustering
    2.6.1 Agglomerative Hierarchical Clustering
    2.6.2 Divisive Hierarchical Clustering


