Hasso-Plattner-Institut
Prof. Dr. Felix Naumann
 

Lecturer

Dr. Gjergji Kasneci

Abstract

Data arising from business transactions, scientific measurements, and other forms of content creation call for automatic data mining and pattern recognition techniques that let us make sense of them efficiently. These techniques must also handle uncertainty, since measurement data may be imprecise and user-generated content may be unreliable.

This lecture will introduce the main concepts of data mining and probabilistic reasoning, ranging from basic probability and information theory to popular classification and clustering algorithms. An introduction to the exciting area of graphical models and probabilistic inference will highlight the link between uncertainty and probabilistic learning models.

Topics

Probability theory, information theory, classification, regression, clustering, graphical models

Literature

I. H. Witten, E. Frank, M. A. Hall: Data Mining: Practical Machine Learning Tools and Techniques (Chapters 1-6)

C. Bishop: Pattern Recognition and Machine Learning (Chapters 1-4, 8, 9)

T. M. Mitchell: Machine Learning (Chapters 3-6, 8, 10)

P. Flach: Machine Learning: The Art and Science of Algorithms that Make Sense of Data (Chapters 1-3, 5-11)

D. J. C. MacKay: Information Theory, Inference, and Learning Algorithms (Chapters 1-6)

Timetable

  • Lectures:
    • Tuesdays 13:30-15:00 in Room H-E.51
    • Every second Thursday 11:00-12:30 in Room H-2.57
  • Exercises:
    • Every second Thursday 11:00-12:30 in Room H-2.57
    • The assignments are available in the "Materialien" folder in the "Interner Bereich"

We thank all students for their help in pointing out errors in the slides!

The exam timetable and room information are now available in the "Interner Bereich".

Date | Topic | pdf
15.10.2013 | Introduction & examples | pdf (final)
17.10.2013 | Basics of probability theory | pdf (final, Version 2 20.11.2013)
22.10.2013 | Basics of statistics (part I) | pdf (final, Version 2 12.11.2013)
24.10.2013 | Exercise 1 | available in "Interner Bereich"
29.10.2013 | Basics of statistics (part II) |
31.10.2013 | No lecture; public holiday |
05.11.2013 | Basics of information theory | pdf (final)
07.11.2013 | Exercise 2 | available in "Interner Bereich"
12.11.2013 | Introduction to classification | pdf (final)
14.11.2013 | Linear classification models (part I) | pdf (final, Version 2 06.12.2013)
19.11.2013 | Linear classification models (part II) |
21.11.2013 | Exercise 3 | available in "Interner Bereich"
25.11.2013 | Linear classification models (part III) |
28.11.2013 | Canceled |
02.12.2013 | Artificial Neural Networks (part I) | pdf (final)
05.12.2013 | Exercise 4 | available in "Interner Bereich"
10.12.2013 | Artificial Neural Networks (part II) |
12.12.2013 | Non-linear classification models (part I) | pdf (final)
17.12.2013 | Non-linear classification models (part II) |
19.12.2013 | Exercise 5 | available in "Interner Bereich"
07.01.2014 | Regression | pdf (final)
09.01.2014 | General clustering algorithms | pdf (final)
14.01.2014 | Clustering: Topic Models (part I) | pdf (final)
16.01.2014 | Clustering: Topic Models (part II) (Note: Moved from Tuesday 21st) |
21.01.2014 | Exercise 6 (Note: Moved from Thursday 16th) | available in "Interner Bereich"
23.01.2014 | Clustering: Topic Models (part III) |
28.01.2014 | Graphical Models (part I) | pdf (final, Version 2 05.02.2014)
30.01.2014 | Exercise 7 | available in "Interner Bereich"
04.02.2014 | Graphical Models (part II), Inference in graphical models | pdf (final)
06.02.2014 | Summary and Exam preparation |

Exam

Condition for exam admission: oral presentation of at least two solutions during the tutorials 

Form of exam: oral exam at the end of the term