Prof. Dr. Tilmann Rabl

Data Management in ML Systems


Prof. Dr. Tilmann Rabl, Hannah Marienwald, Ilin Tolovski, Nils Straßenburg


Distributing the machine learning pipeline has enabled researchers to create increasingly complex model architectures that can be trained on enormous data sets. However, as architectures grow more complex, training time and model size grow with them. The results are usually giant models that take a long time to train, are inefficient to update, and are difficult to store and retrieve. A great deal of the distributed training pipeline is spent on parameter transfers and synchronization between devices, which are needed to advance the training coherently. When it comes to model storage, current research shows that the parameters account for the great majority of a model's storage footprint (ca. 99%).

In this course, we want to address several aspects of the model management life cycle, i.e., efficient communication patterns during training, storage, and retrieval. We will develop methods for efficient parameter transfers during the training process, which should minimize network traffic and help reduce training time. Moreover, storing model parameters efficiently will help reduce a model's storage footprint and the computational cost of model querying. Employing data management techniques in this way should allow us to make distributed ML systems more efficient with regard to training time and storage.



This seminar is structured around project work in the field of Data Management in Machine Learning Systems. Students can work in groups of two to develop a project idea, implement it, and evaluate it. At the end of the course, the students should present their findings and hand in a written report on their topic. We offer the possibility to publish the project results at a topic-related conference.

Paper presentations

In this course, the students will have the opportunity to prepare discussion sessions on state-of-the-art research in machine learning systems. This includes studying a research paper in detail, presenting it in front of the group, introducing valuable insights, and leading the subsequent discussion. To prepare for this, we will first discuss best practices for reading, writing, and presenting scientific papers. Ideally, the papers presented in our sessions will cover the related work of the chosen project topics. Every week, each student will need to summarize one of the presented papers in a one-pager.


  • Project + report - 60%
  • Final presentation - 20%
  • Paper presentations - 20%


  • The course will be conducted on-site at HPI. The lectures will take place on Thursdays, 13:30 - 15:00 in room F - E.06.
  • There is an option to follow the seminar online. The Zoom link will be shared in Moodle.
  • Course management via Moodle. There we will make any announcements and share course materials.
  • HPI Moodle Course
  • The course is limited to 12 students.
  • If you have any questions, please contact me at ilin.tolovski (at) hpi.de


  • Week 1: Introduction to the seminar: Data Management in ML Systems
    • Course Logistics
    • Introduction to Deep Learning
    • Model Training in Distributed Environments
    • Model Management
    • Discussion of open research questions
    • Present project topics
  • Week 2: How to read a scientific paper
  • Week 3: Paper presentations
  • Week 4: Paper presentations
  • Week 5: Paper presentations
  • Week 6: Proposal presentations 
  • Week 7: Paper presentations
  • Week 8: Paper presentations
  • Christmas Break: 20.12.2021 - 02.01.2022
  • Week 9: Paper presentations
  • Week 10: Project consultation
  • Week 11: Intermediate Presentation
  • Week 12: How to write a scientific paper
  • Week 13: Project consultation
  • Weeks 14+15: Final Presentations (15 min presentations)
  • Deadline for reports: 28.02.2022