Hasso-Plattner-Institut

Machine Intelligence with Deep Learning (Winter Semester 2023/2024)

Lecturers: Dr. Haojin Yang (Internet Technologies and Systems), Gregor Nickel, Jona Otholt (Internet Technologies and Systems), Weixing Wang

General Information

  • Weekly hours per semester (SWS): 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.10.2023 - 31.10.2023
  • Examination date per §9 (4) BAMA-O: 05.02.2024
  • Course format: Seminar / Project
  • Enrollment type: Compulsory elective module
  • Teaching language: English
  • Maximum number of participants: 12

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA
  • ISAE: Internet, Security & Algorithm Engineering
    • HPI-ISAE-K Concepts and Methods
    • HPI-ISAE-T Techniques and Tools
    • HPI-ISAE-S Specialization
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-K Concepts and Methods
    • HPI-OSIS-T Techniques and Tools
    • HPI-OSIS-S Specialization
Data Engineering MA
Digital Health MA
Cybersecurity MA
Software Systems Engineering MA


Artificial intelligence (AI) is the intelligence exhibited by computers. The term is applied when a machine mimics "cognitive" functions that humans associate with other human minds, such as "learning" and "problem solving". In the past five years, there have been significant advances in the field of AI. Notable breakthroughs include deep learning algorithms that have enabled AI systems to match or surpass human performance on a range of tasks in image and speech recognition, natural language processing, and game playing. The availability of data and computing power has also grown rapidly, which has fueled progress in AI research.

Looking ahead to the next decade, the prospects for AI are even more exciting. Many experts predict that AI will continue to transform a wide range of industries, from healthcare and transportation to finance and education. However, there are also concerns about the potential impact of AI on employment, privacy, and security, and it will be important to address these issues as AI becomes more integrated into our lives. For young university students, the developments in AI mean that there will be many exciting career opportunities in the field, both in research and in industry. AI will likely play an increasingly important role in shaping our world, and those who understand its potential and limitations will be well positioned to make a difference. At the same time, it will be important for students to consider the ethical implications of AI and to develop a nuanced understanding of how it can be used to benefit society.

In this seminar, the project topics we provide are closely related to our current research work. We will help students understand the topics and acquire the relevant knowledge during the course. Typically, after attending the seminar, our students are familiar with deep learning frameworks and have a good understanding of the related research topics.

Course language: English and German

Topics in this seminar:

  • ViT meets BNN: In recent years, the transformer architecture has become particularly popular in large natural language processing (NLP) models. A transformer is a neural network that mimics cognitive attention. The main idea is to focus on small but important parts of the data. In NLP, transformers measure the relationship between each pair of input tokens, for example words in the case of text strings.
    A transformer-like architecture can also be applied to computer vision tasks; in this case, we speak of Vision Transformers (ViT). Compared to Convolutional Neural Networks (CNNs), transformers lack translation invariance and a locally restricted receptive field. It makes little sense to measure the relationship between individual pixels in an image. For this reason, the image is split into patches of 16x16 pixels. The linear embeddings of the flattened patches are calculated, and positional embeddings are added. The resulting sequence is used as input to a standard transformer encoder. ViT models can outperform CNNs in accuracy, with the drawback of being very computationally intensive. In this seminar topic, we will apply the binary neural network (BNN) technique to ViTs to determine the training effort required and how close we can get to state-of-the-art image classification results.
  • LLMs in control: Large Language Models (LLMs) are typically multitask models. An LLM is capable of understanding, interpreting, and generating human-like text based on its input. This is possible because it is trained on extensive corpora of textual data, which allows it to demonstrate a deep understanding of language, context, and semantics.
    While it is fascinating to observe the generation of paragraphs, it is important not only to track and control this process but also to align it with our expectations. This goal leads to a multitude of research topics that warrant further exploration. For instance, one area of focus is how to train LLMs to generate less toxic and less biased content. Additionally, we are interested in investigating techniques to extract and modify specific knowledge that is stored in the parameters of the model. In this seminar, our main focus will be on Transformer decoder-only models such as GPT and the LLaMA family. These models generate content in an autoregressive manner, meaning that the prediction of the next token in the sequence depends on the previous tokens. Our goal is to dive deep into this decoding process and strive to enhance it for safer, more controllable generation.
  • Extreme weather events such as tropical cyclones can cause widespread destruction and even result in human casualties. Accurately detecting them aids scientists in their analysis and helps to better understand these phenomena. However, current detection heuristics are far from perfect and often disagree with each other as well as with human judgement. New deep-learning-based methods might be able to improve on these heuristics. However, due to the high cost of collecting expert labels, only a small number of labeled samples is available. One paradigm for approaching such tasks in other domains is to first conduct self-supervised pretraining and then fine-tune the pretrained model on the target task. In this project, we will investigate whether this approach can be applied to weather data as well, by attempting to tackle the ClimateNet benchmark using mask-based pretraining.
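
The patch pipeline described under "ViT meets BNN" can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the seminar: the toy 32x32 grayscale image, the embedding size of 8, the random weights, and the function names `split_into_patches`, `binarize`, and `embed_patches` are all hypothetical. Only the weights of the patch projection are binarized with a sign function here; which layers a BNN-style ViT binarizes, and how it is trained, is left open.

```python
import random


def split_into_patches(image, patch_size):
    """Split a 2D grayscale image (list of rows) into flattened square patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image size must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Flatten the patch row by row into one vector.
            patches.append([image[top + dy][left + dx]
                            for dy in range(patch_size)
                            for dx in range(patch_size)])
    return patches


def binarize(weights):
    """Map every weight to +1 or -1 with the sign function (0 maps to +1)."""
    return [[1.0 if w >= 0 else -1.0 for w in row] for row in weights]


def embed_patches(patches, weights, positional):
    """Linearly project each flattened patch and add its positional embedding."""
    tokens = []
    for index, patch in enumerate(patches):
        projected = [sum(w * x for w, x in zip(row, patch)) for row in weights]
        tokens.append([p + e for p, e in zip(projected, positional[index])])
    return tokens


# Toy setup (all values are assumptions chosen for illustration).
rng = random.Random(0)
patch_size, embed_dim = 16, 8
image = [[rng.random() for _ in range(32)] for _ in range(32)]
patches = split_into_patches(image, patch_size)
weights = [[rng.gauss(0, 1) for _ in range(patch_size * patch_size)]
           for _ in range(embed_dim)]
positional = [[rng.gauss(0, 1) for _ in range(embed_dim)] for _ in patches]

# The token sequence that would be fed to a transformer encoder.
tokens = embed_patches(patches, binarize(weights), positional)
print(len(tokens), len(tokens[0]))
```

With binarized weights, each projection reduces to additions and subtractions of pixel values, which is the source of the potential efficiency gain; the open question for the project is how much accuracy this costs.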

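The autoregressive decoding loop mentioned under "LLMs in control" can be illustrated with a toy example. The sketch below makes strong simplifying assumptions: `next_token_scores` is a hypothetical stand-in for a language model that only looks at the last token (a real decoder-only model conditions on the whole prefix), the vocabulary and scores are invented, and banning tokens is just one very simple way to intervene in decoding, not the method studied in the seminar.

```python
def next_token_scores(prefix):
    """Hypothetical stand-in for an LLM: score candidate next tokens.

    This toy only uses the last token of the prefix; a real decoder-only
    model would condition on all previous tokens.
    """
    table = {
        "<bos>": {"the": 0.6, "a": 0.4},
        "the": {"storm": 0.7, "model": 0.3},
        "a": {"model": 1.0},
        "storm": {"<eos>": 1.0},
        "model": {"<eos>": 1.0},
    }
    return table[prefix[-1]]


def greedy_decode(score_fn, max_new_tokens=10, banned=frozenset()):
    """Generate tokens one at a time, always picking the best allowed token."""
    tokens = ["<bos>"]
    for _ in range(max_new_tokens):
        # Control hook: drop banned tokens before choosing the next one.
        scores = {t: s for t, s in score_fn(tokens).items() if t not in banned}
        if not scores:
            break
        best = max(scores, key=scores.get)
        tokens.append(best)  # the new token becomes part of the next prefix
        if best == "<eos>":
            break
    return tokens


print(greedy_decode(next_token_scores))
# ['<bos>', 'the', 'storm', '<eos>']
print(greedy_decode(next_token_scores, banned={"storm"}))
# ['<bos>', 'the', 'model', '<eos>']
```

Because every step feeds the extended prefix back into the model, any change made at one step (here, removing a candidate) affects all later tokens, which is why the decoding process is a natural place to steer generation.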

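The mask-based pretraining idea from the weather topic can be sketched as follows: hide a random subset of input patches and score a reconstruction only on the hidden ones, so that no expert labels are needed. This is a rough sketch under assumptions; the tiny two-value "patches", the mask ratio of 0.5, the zero fill value, and the helper names `random_mask`, `apply_mask`, and `masked_mse` are hypothetical and say nothing about the actual ClimateNet data format or the model used in the project.

```python
import random


def random_mask(num_patches, mask_ratio, rng):
    """Return one boolean per patch; True means the patch is hidden."""
    num_masked = int(num_patches * mask_ratio)
    masked = set(rng.sample(range(num_patches), num_masked))
    return [i in masked for i in range(num_patches)]


def apply_mask(patches, mask, fill=0.0):
    """Replace every masked patch with a constant fill value."""
    return [[fill] * len(p) if m else list(p) for p, m in zip(patches, mask)]


def masked_mse(prediction, target, mask):
    """Mean squared error computed over the masked patches only."""
    errors = [(p - t) ** 2
              for pred, tgt, m in zip(prediction, target, mask) if m
              for p, t in zip(pred, tgt)]
    return sum(errors) / len(errors) if errors else 0.0


# Toy data: four "patches" with two values each (purely illustrative).
patches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
mask = random_mask(len(patches), mask_ratio=0.5, rng=random.Random(0))
corrupted = apply_mask(patches, mask)

# A model would be trained to reconstruct `patches` from `corrupted`.
# Here the corrupted input itself serves as a trivial "prediction" baseline.
loss = masked_mse(corrupted, patches, mask)
print(mask, loss)
```

After such pretraining, the encoder would be fine-tuned on the small labeled set; whether this transfers to weather data is exactly the question the project poses.
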
  • Suggested prerequisites include completing HPI courses such as "Introduction to Deep Learning" (Prof. Lippert), "Introduction to Data Science, and Machine Learning" (Prof. de Melo), or similar MOOC courses.
  • Strong interest in video/image processing, machine learning (deep learning), and/or computer vision
  • Software development in Python or C/C++
  • Experience with PyTorch and machine learning applications is a plus



Deep Learning framework:


The final evaluation will be based on:

  • Initial implementation / idea presentation, 10%

  • Final presentation, 20%

  • Report (research paper), 12-18 pages, 30%

  • Implementation, 40%

  • Participation in the seminar (bonus)


Short explanation: This is a project seminar with weekly meetings for each student group. We arrange the meeting times flexibly, depending on each participant's schedule, and do not enforce the time slots officially assigned to the seminar. The usual meeting location is our chair's meeting room, so the seminar room is generally not required. We typically allocate three fixed slots: the kick-off meeting, the mid-term presentation, and the final presentation. For these three slots, we usually need the seminar room.

17.10.2023 (15:15-16:30), Room: G1-E.15/16

  • Introduction and Q&A session for seminar topics. [Slides]

until 24.10.2023


until 28.10.2023

  • Topics and Teams finalized
  • Individual meetings arranged


  • Individual meeting with your tutor

19.12.2023 (15:15-16:30), Room: G1-E.15/16

  • Mid-term presentation (15+5min)

13.02.2024 (15:15-16:30), Room: G1-E.15/16

  • Final presentation (15+5min)



until 31.03.2024

  • Grading finished