Publication Date

Spring 2011

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

T. Y. Lin

Second Advisor

Soon Tee Teoh

Third Advisor

Howard Ho

Abstract

Mining association rules is an important task in data mining. The mining process is both time-consuming and computationally expensive, so finding the large itemsets quickly and efficiently is a crucial part of any association rule algorithm. This paper focuses on the study and implementation of two algorithms in parallel computing environments: the Bitmap Combination algorithm and the Bitmap FP-Growth algorithm. Unlike the Apriori algorithm, neither needs to generate candidate itemsets, which avoids costly database scans. Both algorithms translate the original database into a bitmap format, analyze the bit distribution to reduce the database size, and apply high-speed bit operations to improve performance. Divide-and-conquer replaces generate-and-test as the basic strategy. The Bitmap Combination algorithm quickly combines any two, three, four, or more rows and then screens for the qualifying itemsets. The Bitmap FP-Growth algorithm applies special bit operations to mine association rules recursively. The experimental results in this paper show that both algorithms greatly improve the efficiency and performance of association rule mining and, in particular, make it feasible to mine association rules in highly parallel computing environments.
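The bitmap idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation. It assumes each item is stored as one bitmap row (a Python integer whose bit i is set when transaction i contains the item), so the support of an itemset is the number of set bits in the bitwise AND of its rows. The function names `build_bitmaps` and `frequent_itemsets`, the `max_size` limit, and the brute-force enumeration with `itertools.combinations` are assumptions made for illustration; the project's divide-and-conquer strategy and its database-reduction step are not reproduced here.

```python
from itertools import combinations


def build_bitmaps(transactions):
    """Return {item: bitmap}, where bit i of the bitmap is set
    when transaction i contains the item."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def popcount(bits):
    """Count the set bits of a non-negative integer."""
    return bin(bits).count("1")


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return {itemset tuple: support} for itemsets of up to max_size items.

    Support is computed by AND-ing the item rows and counting set bits,
    so the transaction list is read only once, when the bitmaps are built.
    """
    bitmaps = build_bitmaps(transactions)
    # Items that are infrequent on their own cannot appear in a frequent itemset.
    items = sorted(i for i, b in bitmaps.items() if popcount(b) >= min_support)

    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            bits = bitmaps[combo[0]]
            for item in combo[1:]:
                bits &= bitmaps[item]  # keep only transactions containing every item
            support = popcount(bits)
            if support >= min_support:
                result[combo] = support
    return result


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the project's experiments.
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, support in sorted(frequent_itemsets(data, min_support=3).items()):
        print(itemset, support)
```

Because each support count reduces to a handful of AND operations on independent rows, the combinations could in principle be evaluated in parallel, which is the property the abstract highlights.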
