Hasso-Plattner-InstitutSDG am HPI

Practical Applications of Deep Learning (Summer Semester 2020)

Lecturers: Dr. Haojin Yang (Internet-Technologien und -Systeme), Christian Bartz (Internet-Technologien und -Systeme), Joseph Bethge (Internet-Technologien und -Systeme), Goncalo Filipe Torcato Mordido (Internet-Technologien und -Systeme)

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 06.04.2020 - 22.04.2020
  • Course format: Seminar
  • Enrollment type: Compulsory elective module
  • Teaching language: English
  • Maximum number of participants: 12

Degree Programs & Modules

IT-Systems Engineering MA
  • ITSE-Analyse
  • ITSE-Entwurf
  • ITSE-Konstruktion
  • ITSE-Maintenance
  • ISAE-Spezialisierung
  • ISAE-Techniken und Werkzeuge
  • OSIS-Konzepte und Methoden
  • OSIS-Spezialisierung
  • OSIS-Techniken und Werkzeuge
  • ISAE-Konzepte und Methoden
Data Engineering MA
Digital Health MA
Cybersecurity MA


Artificial intelligence (AI) is the intelligence exhibited by computers. The term is applied when a machine mimics "cognitive" functions that humans associate with human minds, such as "learning" and "problem solving". Researchers and developers in this field are currently working on AI and machine learning algorithms that train computers to mimic human skills such as "reading", "listening", "writing" and "making inferences". Since 2006, "Deep Learning" (DL) has attracted increasing attention in both academia and industry. Deep learning is a branch of machine learning based on a set of algorithms that attempt to learn representations of data and model their high-level abstractions. In a deep neural network, there are multiple so-called "neural layers" between the input and the output. The algorithm uses those layers, which are composed of multiple linear and non-linear transformations, to learn higher-level abstractions. Recently, DL has achieved record-breaking results in many new areas, e.g., beating humans in real-time strategy games (StarCraft), powering self-driving cars, and achieving dermatologist-level classification of skin cancer. In our current research, we focus on video analysis and multimedia information retrieval (MIR) using deep learning techniques.
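As a rough illustration of layers composed of linear and non-linear transformations, the following minimal sketch runs a forward pass through a small network in plain Python. The layer sizes, the random weights, the choice of ReLU as the non-linearity, and all function names are assumptions made for this example. The network is not trained and is not taken from any seminar material.

```python
import random


def linear(x, weights, bias):
    """Linear (affine) transformation: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise non-linear transformation: max(0, value)."""
    return [max(0.0, v) for v in x]


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases (illustrative only, no training)."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def forward(x, layers):
    """Apply linear + ReLU for each hidden layer, linear only for the output."""
    for weights, bias in layers[:-1]:
        x = relu(linear(x, weights, bias))
    weights, bias = layers[-1]
    return linear(x, weights, bias)


rng = random.Random(0)
# Two hidden layers between a 4-dimensional input and a 3-dimensional output.
layers = [make_layer(4, 8, rng), make_layer(8, 8, rng), make_layer(8, 3, rng)]
output = forward([0.5, -1.2, 3.0, 0.1], layers)
print(len(output))
```

Each hidden layer re-represents the output of the previous one, which is the sense in which deeper layers can capture higher-level abstractions once the weights are learned from data.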

Course language: German and English

Topics in this seminar:

  • Creating 'Fake' Lecture Videos: Methods based on deep learning are very powerful in the area of computer vision. Since the introduction of Generative Adversarial Networks (GANs) by Goodfellow et al., image generation has seen a large boost in popularity, and today very strong methods are available that can generate highly realistic, high-resolution images. The power of generative networks also comes with several downsides. One of them is the possibility of generating fake data that is (for a human) nearly indistinguishable from real data. These so-called deep fakes raise ethical discussions and also motivate more research into detecting whether image or video material is faked. However, being able to fake data does not necessarily have to be a bad thing. Think about a video platform like OpenHPI, where lecture videos have to be recorded. These videos have to be post-processed to remove mistakes made by the speakers and to give the video a professional touch. This post-processing is highly time-consuming and could benefit from automatic methods that help to seamlessly remove mistakes. Another possibility is to generate a realistic-looking video by supplying only a written transcript and letting the computer do the rest. In this seminar topic, we want to look at automatic methods for editing videos with only one speaker (the typical setting of lecture videos). If everything works as planned, we will implement a method that lets the user modify the recorded video sequence by editing an automatically extracted transcript of the words spoken by the lecturer. (Introduction video can be found here.)
  • Post-Training Quantization of Generative Adversarial Networks: Generative Adversarial Networks (Goodfellow et al. 2014) have been applied successfully to several image generation tasks. The framework originally consists of two models: one generative and one discriminative network. Even though both models are used during training, the discriminative network is discarded in the end, and only the generative network is used to generate the desired fake data. The goal of this topic is to study how the discriminative model can help to quantize the generative model after training, reducing its overall size with minimal loss of performance. (Introduction video can be found here.)
  • Optimizing Inference of Binary Neural Networks: Convolutional neural networks have achieved astonishing results in different application areas. Various methods that allow us to use these models on mobile and embedded devices have been proposed. Binary Neural Networks (BNNs) in particular seem to be a promising approach for devices with low computational power or applications with real-time requirements. In this topic, you are going to optimize the inference of BNNs with BMXNet 2, based on advances in other frameworks. The final goal is to run a real-time machine learning demo application on a Raspberry Pi (provided by us) without relying on a network connection. (Introduction video can be found here.)
  • 'Light' Neural Architecture Search: Previous work in the area of neural architecture search has shown that machine-optimized networks can perform better than those designed by humans. In this topic, you are going to implement a neural architecture search specifically for BNNs that focuses on keeping the number of operations low and the networks fast. Starting from the two networks BinaryDenseNet and MeliusNet, you are going to optimize the number of blocks for these networks. We suggest an implementation in BMXNet 2, but Larq, which is based on Keras and TensorFlow, is an alternative. (Introduction video can be found here.)
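To give a feeling for why Binary Neural Networks suit devices with low computational power, here is a minimal sketch of a binarized dot product in plain Python. It assumes the common scheme of binarizing values to +1/-1 with the sign function and replacing multiply-accumulate with XNOR and a bit count. The function names are hypothetical, and the code does not show how BMXNet 2 or Larq implement inference.

```python
def binarize(values):
    """Map each real value to +1 or -1 with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an integer: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR + popcount."""
    mask = (1 << n) - 1
    xnor = ~(a_word ^ b_word) & mask      # bit is 1 where both signs agree
    matches = bin(xnor).count("1")        # popcount
    return 2 * matches - n                # agreements minus disagreements


activations = [0.3, -1.2, 0.8, -0.1]
weights = [-0.5, -0.7, 0.2, 0.9]

a_signs = binarize(activations)
w_signs = binarize(weights)

# Reference result computed with ordinary multiplications.
reference = sum(a * w for a, w in zip(a_signs, w_signs))
fast = binary_dot(pack_bits(a_signs), pack_bits(w_signs), len(a_signs))
print(reference, fast)
```

Because each weight and activation occupies a single bit, many of them fit into one machine word, and one XNOR plus one bit count stands in for many multiply-accumulate operations.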

We are currently preparing more detailed video introductions to these topics:

  • This Video (11 min) contains the introduction to the topic Creating 'Fake' Lecture Videos.
  • This Video (5 min) contains the introduction to the topic Post-Training Quantization of Generative Adversarial Networks.
  • This Video (12 min) contains a brief introduction to Binary Neural Networks in general and to the two topics Optimizing Inference of Binary Neural Networks and 'Light' Neural Architecture Search.


  • Strong interest in video/image processing, machine learning (Deep Learning) and/or computer vision
  • Software development in C/C++ or Python
  • Experience with OpenCV and machine learning applications is a plus



  • Alex Smola, Mu Li et al., "Dive into Deep Learning"
  • Ian J. Goodfellow, Yoshua Bengio and Aaron Courville, "Deep Learning", online version
  • Pedro Domingos, "The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World"
  • Christopher M. Bishop, "Pattern Recognition and Machine Learning"

Online courses:

  • cs231n tutorials: Convolutional Neural Networks for Visual Recognition
  • Deep Learning courses at Coursera

Deep Learning frameworks:


The final evaluation will be based on:

  • Initial implementation / idea presentation, 10%
  • Final presentation, 20%
  • Report (research paper), 12-18 pages, 30%
  • Implementation, 40%
  • Participation in the seminar (bonus points)


Monday, 15:15 - 16:45

(apart from the presentations, there will be no regular meetings in our seminar room!)

Room H-E.51 (or in our virtual room)

from 15.04.2020

Presentation of Seminar Topics (Videos)

20.04.2020 (15:15) Meet other interested students, find a team, and ask questions. (join us here)

until 22.04.2020

Enrollment in the seminar at the Student Affairs Office (Studienreferat)

(Send your preferred and secondary topic to: christian.bartz[ätt]hpi.de)

until 29.04.2020

Topics and Teams finalized


Meetings with your tutor

08.06.2020 (15:15)

(Date changed!)

Mid-Term presentation (15+5min)

20.07.2020 (15:15)

Final presentation (15+5min)

until end of August 2020

Hand in code and paper (LaTeX template)

until end of September 2020