UPDATING LARGE ITEMSETS
WITH EARLY PRUNING
A thesis submitted to the Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent University in partial fulfillment of the requirements for the degree of Master of Science
By
Necip Fazıl Ayan
July, 1999
I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a thesis for the degree of Master of Science.
Prof. Dr. Erol Arkun (Principal Advisor)
I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a thesis for the degree of Master of Science.
Assoc. Prof. Dr. Özgür Ulusoy
I certify that I have read this thesis and that in my opinion it is fully adequate, in scope and in quality, as a thesis for the degree of Master of Science.
Asst. Prof. Dr. Uğur Güdükbay
Approved for the Institute of Engineering and Science:
Prof. Dr. Mehmet Baray, Director of Institute of Engineering and Science
ABSTRACT
UPDATING LARGE ITEMSETS
WITH EARLY PRUNING
Necip Fazıl Ayan
M.S. in Computer Engineering and Information Science
Supervisor: Prof. Dr. Erol Arkun
July, 1999
With the computerization of many business and government transactions, huge amounts of data have been stored in computers. The existing database systems do not provide the users with the necessary tools and functionalities to capture all stored information easily. Therefore, automatic knowledge discovery techniques have been developed to capture and use the voluminous information hidden in large databases. Discovery of association rules is an important class of data mining, which is the process of extracting interesting and frequent patterns from the data. Association rules aim to capture the co-occurrences of items, and have wide applicability in many areas. Discovering association rules is based on the computation of large itemsets (set of items