Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Collaborative Filtering

In this seminar, we want to apply different collaborative filtering techniques to the Yahoo! Music Dataset and possibly submit the results to the KDD Cup at kddcup.yahoo.com.
The main challenges include the size of the data set (1 million users, more than 600,000 items), the hierarchical organization of the data (artist -> album -> song), and the exploitation of the time stamps attached to the ratings.

We expect the students to extend existing collaborative filtering techniques to solve these challenges.

The maximum number of students is 8, resulting in 4 teams. As the cup offers two tracks, we would like to assign two teams to each track.

Time schedule

To participate, please join us at the first meeting on April 14th in H.E-52.

As the submission deadline for the KDD Cup is June 30th, the seminar ends shortly thereafter. Therefore, we expect students to invest most of their time in the first 2 months.

We want to have short (intermediate) presentations roughly every 2 weeks. Each presentation should last 15 minutes plus 5 minutes of discussion, so that all 4 groups can present their progress in one meeting. In the other weeks, we propose consultations with the supervisor.

  • April 14th: first seminar, topic presentations
  • April 16th: application deadline, team/paper preferences
  • April 17th: team/paper notification
  • April 21st: mandatory consultation
  • April 28th: paper presentation
  • May 12th: initial implementation/idea presentation
  • June 9th: intermediate presentation/project consolidation
  • June 30th: KDD Cup submission deadline (results only)
  • July 7th: final presentations

Slides

Date             Topic                            Slides
April 14th 2011  Introduction and Organization    pdf
April 28th 2011  Paper presentations
                   Item-based CF                  pdf
                   Matrix Factorization           pdf
                   Who rated what?                pdf
May 12th 2011    Idea presentations
                   Item-based CF                  pdf
                   Matrix Factorization           pdf
                   Who rated what?                pdf
May 19th 2011    Weka                             pdf
June 9th 2011    Intermediate presentations
                   Item-based CF                  pdf
                   Matrix Factorization           pdf
                   Who rated what?                pdf
July 7th 2011    Final presentations
                   Item-based CF                  pdf
                   Matrix Factorization           pdf
                   Who rated what?                pdf

Grading Process

  • 3 LP
  • Paper and final presentations
  • Participation during intermediate presentations / discussions
  • Implementation strategies/proposed extensions
  • Short paper
  • Bonus: good results in the KDD Cup

Literature

Overview

Su, X. & Khoshgoftaar, T.M.
A survey of collaborative filtering techniques
Advances in Artificial Intelligence, Vol. 2009
Hindawi Publishing Corporation, 2009

Track 1

Sarwar, B., Karypis, G., Konstan, J. & Riedl, J.
Item-based collaborative filtering recommendation algorithms
Proceedings of the 10th international conference on World Wide Web
ACM, 2001

Bell, R.M. & Koren, Y.
Improved Neighborhood-based Collaborative Filtering
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
ACM, 2007

Koren, Y., Bell, R. & Volinsky, C.
Matrix Factorization Techniques for Recommender Systems
Computer, Vol. 42
IEEE Computer Society Press, 2009

Track 2

Kurucz, M., Benczúr, A.A., Kiss, T., Nagy, I., Szabó, A. & Torma, B.
Who Rated What: a Combination of SVD, Correlation and Frequent Sequence Mining
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
ACM, 2007

Sueiras, J.
A classical predictive modeling approach for Task "Who rated what?" of the KDD CUP 2007
Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining
ACM, 2007