Algorithms for Pattern Mining
Description
In this seminar, students learn data mining techniques for discovering frequent itemsets, that is, sets of items that occur together in many records of a data set. Frequent itemset discovery is useful for analyzing data, generating association rules, deriving machine learning features, and many other applications.
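To make the notion of a frequent itemset concrete, the following is a minimal level-wise sketch in the spirit of the Apriori approach from the literature list. It is an illustration under stated assumptions, not a reference implementation: the function name `frequent_itemsets`, the absolute `min_support` threshold, and the toy shopping-basket data in the usage note are hypothetical and chosen only for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in
    at least `min_support` transactions (absolute count, an assumption)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count the single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: combine frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        # Count step: one pass over the data per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result
```

For example, calling `frequent_itemsets([{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}], 2)` reports the three single items plus the pairs {bread, milk} and {bread, butter}; the pair {milk, butter} occurs only once and is dropped, so no three-item candidate survives the prune step.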
We expect students to examine existing techniques by implementing (and possibly improving) one approach and an extension of that approach (for instance "multisets" or "utility patterns"). Each team should develop a suitable use case for its extension of the chosen frequent pattern analysis algorithm. Students are free to use any data set for their use case; we regard the DBpedia Infobox triples as a promising option. At the end of the seminar, students evaluate their implemented algorithm on their use case both quantitatively and qualitatively.
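As an illustration of what a "utility patterns" extension might involve, the sketch below computes the utility of an itemset under the common definition of purchased quantity times unit profit, summed over all transactions that contain the whole itemset. The function name `itemset_utility`, the dictionary-based transaction layout, and the numbers in the usage note are assumptions made for this example only; teams are free to define utility differently for their use case.

```python
def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over the items of `itemset`, taken over
    every transaction that contains all items of `itemset`.

    Each transaction is assumed to be a dict mapping item -> quantity.
    """
    total = 0
    for quantities in transactions:
        # Only transactions containing the complete itemset contribute.
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * unit_profit[item]
                         for item in itemset)
    return total
```

With the hypothetical transactions `[{"a": 2, "b": 1}, {"a": 1}, {"a": 1, "b": 3, "c": 1}]` and unit profits `{"a": 5, "b": 2, "c": 10}`, the itemset {a} has utility 20 while its superset {a, b} has utility 23. Unlike support, utility can therefore grow when an item is added, so support-style subset pruning does not carry over directly to this measure.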
The seminar is limited to 6 students, who will work in 3 teams of two.
Important Dates
■ May 15th: intermediate presentation
■ July 10th: final presentations in room A-1.1
■ August 31st: short paper deadline
Slides
Grading Process
3 LP (credit points)
Paper presentation and final presentation
Participation during presentations / discussions
Implementation of strategies/proposed extensions
6-page evaluation report
Literature
R. Agrawal & R. Srikant
Fast Algorithms for Mining Association Rules
VLDB '94
J. Han, J. Pei & Y. Yin
Mining Frequent Patterns without Candidate Generation
SIGMOD '00
M. J. Zaki
Scalable Algorithms for Association Mining
TKDE '00