Symbolic Data Mining in Databases Using the Apriori Algorithm

Date
2013-12-06T09:30:39Z
Abstract

This thesis discusses symbolic data mining, its concepts, and its uses in discovering knowledge hidden in large datasets. Data mining is a multidisciplinary field that draws on database technology, machine learning, statistics, pattern recognition, information retrieval, neural networks, knowledge-based systems, artificial intelligence, high-performance computing, and data visualization, among other areas. We present techniques and methods for discovering patterns hidden in large datasets and examine issues relating to their feasibility, usefulness, effectiveness, and scalability. We also discuss data mining in relation to data warehousing and OLAP, and the extraction of valid and ultimately understandable patterns through frequent and rare itemset mining using the Apriori algorithm. The thesis includes a Java implementation of the Apriori algorithm that illustrates how frequent and rare patterns are obtained from a given dataset. We found rare pattern mining compelling because most patterns and rules with high support (the frequent patterns) are obvious and well known to domain experts; patterns and rules with low support (the rare patterns) are therefore of particular interest, since they can provide new insights.
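As an illustration of the level-wise search the abstract refers to, the following is a minimal Java sketch of Apriori. It is not the thesis's implementation: the class name `AprioriSketch`, the fields `frequent` and `rare`, the use of an absolute support count as the threshold, and the sample baskets are all assumptions made for this example. Candidates that pass the subset-pruning step but fall below the threshold are collected as rare itemsets, which is one plausible way rare patterns can be read off an Apriori run.

```java
import java.util.*;

/** Hypothetical minimal Apriori sketch; not the code from the thesis. */
public class AprioriSketch {
    /** Itemsets whose support count is at least minSupport. */
    public final Map<Set<String>, Integer> frequent = new LinkedHashMap<>();
    /** Candidates whose proper subsets are all frequent but which fall below minSupport. */
    public final Map<Set<String>, Integer> rare = new LinkedHashMap<>();

    public AprioriSketch(List<Set<String>> transactions, int minSupport) {
        // Level 1: every distinct item is a singleton candidate.
        Set<Set<String>> candidates = new LinkedHashSet<>();
        for (Set<String> t : transactions) {
            for (String item : t) {
                candidates.add(new TreeSet<>(Collections.singleton(item)));
            }
        }
        while (!candidates.isEmpty()) {
            List<Set<String>> level = new ArrayList<>();
            for (Set<String> c : candidates) {
                // Support = number of transactions that contain the whole candidate.
                int support = 0;
                for (Set<String> t : transactions) {
                    if (t.containsAll(c)) support++;
                }
                if (support >= minSupport) {
                    frequent.put(c, support);
                    level.add(c);
                } else {
                    rare.put(c, support);
                }
            }
            candidates = nextCandidates(level);
        }
    }

    /** Joins frequent k-itemsets into (k+1)-candidates and prunes by the Apriori property. */
    private static Set<Set<String>> nextCandidates(List<Set<String>> level) {
        Set<Set<String>> levelSet = new HashSet<>(level);
        Set<Set<String>> next = new LinkedHashSet<>();
        for (int i = 0; i < level.size(); i++) {
            for (int j = i + 1; j < level.size(); j++) {
                Set<String> union = new TreeSet<>(level.get(i));
                union.addAll(level.get(j));
                // Keep only unions that grow the itemset by exactly one item.
                if (union.size() != level.get(i).size() + 1) continue;
                if (allSubsetsFrequent(union, levelSet)) next.add(union);
            }
        }
        return next;
    }

    /** True if every subset obtained by dropping one item is frequent at the previous level. */
    private static boolean allSubsetsFrequent(Set<String> candidate, Set<Set<String>> levelSet) {
        for (String item : candidate) {
            Set<String> subset = new TreeSet<>(candidate);
            subset.remove(item);
            if (!levelSet.contains(subset)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up market-basket data, used only to exercise the sketch.
        List<Set<String>> baskets = List.of(
            Set.of("bread", "milk"),
            Set.of("bread", "diapers", "beer", "eggs"),
            Set.of("milk", "diapers", "beer", "cola"),
            Set.of("bread", "milk", "diapers", "beer"),
            Set.of("bread", "milk", "diapers", "cola"));
        AprioriSketch result = new AprioriSketch(baskets, 3);
        System.out.println("Frequent: " + result.frequent);
        System.out.println("Rare:     " + result.rare);
    }
}
```

With the sample baskets and a minimum support count of 3, pairs such as {beer, diapers} are kept as frequent, while {bread, diapers, milk} appears in only two baskets and is recorded as rare even though each of its pairs is frequent. A production implementation would typically count supports in a single pass per level and use a more compact itemset representation; the nested loops here favor readability.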

Keywords
Apriori