Author: Hemendra Pal Singh*
This review, “Data Mining: The Mushroom Database,” focuses on the study of mushroom datasets. The purpose of the research is to extend preceding research by running new data sets of stylometry, keystroke capture, and mouse movement data through Weka. Weka stands for Waikato Environment for Knowledge Analysis; it is a popular suite of machine learning software written in Java and developed at the University of Waikato. Weka is free to download and is available under the GNU General Public License. To analyze the mushroom database, the researchers apply data mining through Weka using various data mining algorithms. The study will also extend earlier research at Pace University into the use of a human-machine interface to increase the accuracy of machine learning.
To explain the use of the various algorithms in this study, this research discusses each of them. Naïve Bayes and Apriori will be used against the Stylometry data set. IBk will be used against the Keystroke Capture and Mouse Movement data sets. J48 will be used with the Mushroom Database. The choice of these techniques and their implementation will be discussed in detail in the methodologies section. According to Witten and Frank in Data Mining, the Naïve Bayes method is “based on Bayes’s rule and ‘Naïvely’ assumed independence”; as they note, “it is only valid to multiply probabilities when the events are independent. The assumption that attributes are independent in real life certainly is a simplistic one.”
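The independence assumption quoted above can be illustrated with a minimal sketch. It is not Weka's implementation and not the code used in this study. It is a small, self-contained Python example, with hypothetical attribute names (odor, cap color) and made-up toy rows, that multiplies the per-attribute probabilities for each class as if the attributes were independent. Laplace (add-one) smoothing is an added assumption so that an unseen attribute value does not force a probability of zero.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(attribute_index, label)][value] -> number of occurrences
    value_counts = defaultdict(Counter)
    # values_seen[attribute_index] -> set of distinct values for that attribute
    values_seen = defaultdict(set)
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Return normalized class probabilities for one row."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior probability of the class
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            # Naive step: multiply per-attribute probabilities as if
            # the attributes were independent given the class.
            # Add-one smoothing (an assumption of this sketch).
            score *= (count + 1) / (n + len(values_seen[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Hypothetical toy data: (odor, cap_color) -> class. Not the real Mushroom Database.
rows = [
    ("foul", "brown"), ("foul", "white"), ("none", "brown"),
    ("none", "white"), ("none", "brown"), ("foul", "brown"),
]
labels = ["poisonous", "poisonous", "edible", "edible", "edible", "poisonous"]

model = train(rows, labels)
posterior = predict(model, ("foul", "white"))
print(posterior)
```

Each class score is the class prior multiplied by one term per attribute, which is exactly the multiplication that is only valid when the attributes are independent.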
The methodologies that they use are several different