Hasso-Plattner-Institut (25 Years of HPI)

Applied Probabilistic Machine Learning (Winter Semester 2021/2022)

Lecturers: Dr. Katharina Baum (Data Analytics and Computational Statistics), Hugues Richard (Data Analytics and Computational Statistics), Elizabeth Yuu (Data Analytics and Computational Statistics)
Course website: https://moodle.hpi.de/course/view.php?id=222

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.10.2021 - 22.10.2021
  • Teaching format: Lecture / Exercise
  • Enrollment type: Compulsory elective module
  • Language of instruction: English
  • Maximum number of participants: 30

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-K Concepts and Methods
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-T Techniques and Tools
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-S Specialization
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-K Concepts and Methods
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-S Specialization
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-T Techniques and Tools
Data Engineering MA
Digital Health MA
Cybersecurity MA


Description

In all areas of health and life science, large amounts of data are generated. Analysing, visualising and predicting from these data requires the design and implementation of dedicated procedures. This is the first step towards prediction and inference as a basis for decision making.

Computational statistical methods have evolved to cope with the challenges of large datasets that are not tractable with traditional approaches, e.g. when the dimensionality of the data greatly exceeds the number of observations.

In this course we will learn to use probabilistic models as a principled tool to describe the dependencies between the variables of a system, while accounting for uncertainties. Whereas highly efficient machine learning strategies often work as black boxes, explicit probabilistic models provide a way for users to preserve explainability while still solving a large range of problems.

We will see how inference with those models can be cast in a practical framework for typical machine learning tasks such as classification or clustering. The course will focus on graphical models, a class of models that capture conditional dependencies between variables and have proven useful in multiple applications (document classification, gene detection, ...). Those models will be applied to realistic case studies through small projects.
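As a rough illustration of what "conditional dependencies between variables" means in a graphical model, the sketch below builds a three-variable Bayesian network and queries it by brute-force enumeration. The network structure (Rain -> WetGrass <- Sprinkler), all probability values, and the function names `joint` and `posterior_rain` are hypothetical choices made for this example; they are not taken from the course material.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler (all numbers are made up).
P_RAIN = 0.2
P_SPRINKLER = 0.3
# P(WetGrass = True | rain, sprinkler)
P_WET = {
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.05,
}


def joint(rain, sprinkler, wet):
    """Joint probability, factorised along the graph structure."""
    p = P_RAIN if rain else 1.0 - P_RAIN
    p *= P_SPRINKLER if sprinkler else 1.0 - P_SPRINKLER
    p_wet = P_WET[(rain, sprinkler)]
    p *= p_wet if wet else 1.0 - p_wet
    return p


def posterior_rain(**evidence):
    """P(rain = True | evidence), summing the joint over all consistent states."""
    numerator = 0.0
    denominator = 0.0
    for rain, sprinkler, wet in product([True, False], repeat=3):
        state = {"rain": rain, "sprinkler": sprinkler, "wet": wet}
        # Skip states that contradict the observed evidence.
        if any(state[name] != value for name, value in evidence.items()):
            continue
        p = joint(rain, sprinkler, wet)
        denominator += p
        if rain:
            numerator += p
    return numerator / denominator


p_rain_given_wet = posterior_rain(wet=True)
p_rain_given_wet_and_sprinkler = posterior_rain(wet=True, sprinkler=True)
print(round(p_rain_given_wet, 3))
print(round(p_rain_given_wet_and_sprinkler, 3))
```

With these made-up numbers, observing wet grass raises the probability of rain above its prior of 0.2, while additionally observing that the sprinkler was on lowers it again. Rain and Sprinkler are independent a priori but become dependent once their common effect is observed. Enumeration is only feasible for tiny networks; it is used here purely for transparency.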

Learning Objectives:

  • Understand the typical variants of graphical models (mixture models, Markov models, hidden Markov models, Bayesian networks, topic models, probabilistic PCA).
  • Understand techniques for evaluating the fit of a model and inferring its parameters (optimisation and the EM algorithm).
  • Apply those models to machine learning tasks (supervised/unsupervised learning).
  • Understand and analyse the algorithmic challenges underlying model training.
  • Be able to design and train a model for a given task.
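To give a flavour of the parameter inference mentioned in the objectives above, here is a minimal sketch of the EM algorithm for a two-component, one-dimensional Gaussian mixture, using only the Python standard library. The synthetic data, the initialisation scheme, the fixed iteration count and the names `normal_pdf` and `em_two_gaussians` are assumptions made for this illustration and do not reflect the course's own exercises.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_two_gaussians(data, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative sketch)."""
    # Crude initialisation (an assumption): means at the data extremes,
    # both variances set to the overall variance, equal mixing weights.
    means = [min(data), max(data)]
    overall_mean = sum(data) / len(data)
    overall_var = sum((x - overall_mean) ** 2 for x in data) / len(data)
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            likes = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(likes)
            resp.append([like / total for like in likes])

        # M-step: re-estimate weights, means and variances from the soft assignments.
        for k in range(2):
            n_k = sum(r[k] for r in resp)
            weights[k] = n_k / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k
            variances[k] = max(var_k, 1e-6)  # guard against a collapsing component

    return weights, means, variances


# Synthetic data: two well-separated clusters (hypothetical parameters).
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(300)]
weights, means, variances = em_two_gaussians(data)
print(weights, means, variances)
```

On well-separated data like this, the estimated means should land near the values used to generate the clusters. A more careful implementation would monitor the log-likelihood for convergence and work in log space to avoid numerical underflow.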


Prerequisites

  • Good knowledge of calculus (functional optimisation) and linear algebra is recommended.
  • Knowledge of key statistical concepts (a refresher will be provided in the first lectures).
  • Basic knowledge of either Python or R, or very good skills in another programming language.
  • The course will be given in English.


Literature

The course will be based mainly on the following book, which is available online:

  • Bishop: Pattern Recognition and Machine Learning, Springer

The following books will be used for complementary information:

  • Duda, Hart and Stork: Pattern Classification
  • Murphy: Machine Learning: A Probabilistic Perspective
  • Koller and Friedman: Probabilistic Graphical Models

Learning and Teaching Formats

Lectures will be given on site in HE.51/52 (when possible) each Thursday, 11:00-12:30, starting October 28, 2021. Additional course material will be available online for preparation before the lectures (asynchronous learning). In addition, we plan live online sessions, approximately every two weeks, to discuss the exercises and present the small projects. Dial-in details will be provided in due time.

Homework will consist of exercises (to be worked on paper) and small projects to be prepared as Jupyter notebooks. Solutions to the exercises will be discussed in the lectures where suitable. The small projects will be prepared by the students in groups and handed in every two weeks.

Online teaching material will be made available via Moodle.


Assessment

  • Submission of the small projects (50% of final grade)
  • Presentation of small projects, at least one per semester (40% of final grade)
  • Student participation (10% of final grade)
  • Each student must give a talk on their solution to an exercise at least once during the semester


Dates

Lectures will be given on site in HE.51/52 (when possible) each Thursday, 11:00-12:30, starting October 28, 2021.

First grading: December 9, 2021 (opt-out December 1, 2021)