Hasso-Plattner-Institut

Applied Probabilistic Machine Learning (Wintersemester 2021/2022)

Lecturers: Prof. Dr. Bernhard Renard (Data Analytics and Computational Statistics), Dr. Hugues Richard (Data Analytics and Computational Statistics), Dr. Katharina Baum (Data Analytics and Computational Statistics)
Course Website: https://hpi.de/friedrich/moodle/course/view.php?id=127

General Information

  • Weekly Hours: 4
  • Credits: 6
  • Graded: yes
  • Enrolment Deadline: 01.10.2021 - 22.10.2021
  • Teaching Form: Lecture / Exercise
  • Enrolment Type: Compulsory Module
  • Course Language: English
  • Maximum number of participants: 30

Programs & Modules

IT-Systems Engineering MA
Data Engineering MA
Digital Health MA
  • APAD-Concepts and Methods
  • APAD-Technologies and Tools
  • APAD-Specialization
Cybersecurity MA


In almost all areas of life, large amounts of data are generated, requiring dedicated data analysis procedures that support prediction and inference for decision making. Computational statistical methods have evolved to cope with the challenges of large datasets that are not tractable with traditional approaches, e.g. when the dimensionality of the data greatly exceeds the number of observations.

In this course we will learn to use probabilistic models as a principled tool to describe the dependencies between the variables of a system while accounting for uncertainty.

We will see how inference with these models provides a practical framework for typical machine learning tasks such as classification or clustering. The course will focus on graphical models, a class of models that capture conditional dependencies between variables and have proven useful in many applications (document classification, gene detection, ...). These models will be applied to realistic case studies through small projects.

Learning Objectives:

  • Understand the typical variants of graphical models (mixture models, Markov models, hidden Markov models, Bayesian networks, topic models)
  • Understand techniques for evaluating the fit of a model and inferring its parameters (EM algorithm)
  • Apply those models to machine learning tasks (supervised/unsupervised learning)
  • Understand the underlying algorithmic challenges
  • Be able to design and train a model for a given task


  • Good knowledge of calculus (functional optimisation) and linear algebra
  • Knowledge of key statistical concepts
  • Either basic knowledge of the Python or R programming language, or very good skills in another programming language
  • The course will be given in English


The course will be mainly based on the following book available online [here]:

- Bishop, Pattern Recognition and Machine Learning, Springer


The following books will be used for complementary information:

- Duda, Hart and Stork, Pattern Classification

- Murphy, Machine Learning: A Probabilistic Perspective

- Koller and Friedman, Probabilistic Graphical Models


Lectures will be given on site when possible, and a recording of each lecture will be made available. Call-in details will be provided in due time.

Homework will consist of exercises (to be worked on paper) and small projects to be prepared as Jupyter notebooks. Solutions to the exercises will be discussed in the lectures where suitable. The small projects will be prepared by the students in groups and handed in every two weeks.

Recorded lectures will be made available via tele-TASK.


  • Small projects: 50% of final grade

  • Presentation of small projects (at least one per semester): 40% of final grade

  • Exercises and participation: 10% of final grade