DEEPAK KUMAR D R
M.SC IN COMPUTER SCIENCE, 3RD SEMESTER, DAVANGERE UNIVERSITY
deepakrdevang@gmail.com

Abstract: This topic is used by a number of data mining techniques, such as clustering, nearest neighbor classification, and anomaly detection, and it also extends to data mining applications. In this paper we focus on a variety of techniques, approaches, and research areas that are helpful and regarded as important in the field of data mining technologies. Many multinational corporations (MNCs) and large organizations operate at sites in different countries, and each site may generate large volumes of data. Corporate decision makers require access to all such sources in order to take strategic decisions. In an uncertain and highly competitive business environment, the value of strategic information systems such as these is easily recognized; however, in today’s business environment, efficiency or speed is not the only key to competitiveness. Such huge amounts of data, on the order of terabytes to petabytes, have drastically changed the areas of science and engineering.
Keywords: Aggregation, Anomaly detection and classification, Binarization, Clusters, Clustering, Data mining applications, Dimensionality reduction, Discretization, Issues in proximity calculation, Sampling, Similarity and dissimilarity of data objects.

1. INTRODUCTION
Data preprocessing: Data preprocessing is an important and critical step in the data mining process, and it has a huge impact on the success of a data mining project. The purpose of data preprocessing is to cleanse dirty or noisy data, extract and merge the data from its sources, and then transform and convert the data into a proper format. Data preprocessing has been studied extensively in the past decade, and many commercial products such as