Practical Applications of Deep Learning (Summer Semester 2021)

Lecturers: Dr. Haojin Yang (Internet Technologies and Systems), Joseph Bethge (Internet Technologies and Systems), Hendrik Rätz (Data Analytics and Computational Statistics)

General Information

Weekly hours per semester: 4
ECTS: 6
Graded: Yes
Enrollment period: 18.03.2021 - 09.04.2021
Course type: Seminar
Enrollment type: Compulsory elective module
Teaching language: English
Maximum number of participants: 9

Degree Programs, Module Groups & Modules

IT-Systems Engineering MA

IT-Systems Engineering
- HPI-ITSE-E Design
IT-Systems Engineering
- HPI-ITSE-K Construction
ISAE: Internet, Security & Algorithm Engineering
- HPI-ISAE-K Concepts and Methods
ISAE: Internet, Security & Algorithm Engineering
- HPI-ISAE-T Technologies and Tools
ISAE: Internet, Security & Algorithm Engineering
- HPI-ISAE-S Specialization
OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-K Concepts and Methods
OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-T Technologies and Tools
OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-S Specialization

Data Engineering MA

DATA: Data Analytics
- HPI-DATA-K Concepts and Methods
DATA: Data Analytics
- HPI-DATA-T Technologies and Tools
DATA: Data Analytics
- HPI-DATA-S Specialization
PREP: Data Preparation
- HPI-PREP-K Concepts and Methods
PREP: Data Preparation
- HPI-PREP-T Technologies and Tools
PREP: Data Preparation
- HPI-PREP-S Specialization

Digital Health MA

SCAD: Scalable Computing and Algorithms for Digital Health
- HPI-SCAD-C Concepts and Methods
SCAD: Scalable Computing and Algorithms for Digital Health
- HPI-SCAD-T Technologies and Tools
SCAD: Scalable Computing and Algorithms for Digital Health
- HPI-SCAD-S Specialization
APAD: Acquisition, Processing and Analysis of Health Data
- HPI-APAD-C Concepts and Methods
APAD: Acquisition, Processing and Analysis of Health Data
- HPI-APAD-T Technologies and Tools
APAD: Acquisition, Processing and Analysis of Health Data
- HPI-APAD-S Specialization

Cybersecurity MA

Description

Artificial intelligence (AI) is intelligence exhibited by computers. The term is applied when a machine mimics "cognitive" functions that humans associate with human minds, such as "learning" and "problem solving". Researchers and developers in this field are currently working on AI and machine learning algorithms that train computers to mimic human skills such as "reading", "listening", "writing", and "making inferences". Since 2006, "Deep Learning" (DL) has attracted increasing attention in both academia and industry. Deep learning is a branch of machine learning based on a set of algorithms that attempt to learn representations of data and model their high-level abstractions. In a deep neural network, there are multiple so-called "neural layers" between the input and the output. The algorithm uses those layers, which apply multiple linear and non-linear transformations, to learn higher-level abstractions. Recently, DL has achieved record-breaking results in many novel areas, e.g., beating humans in real-time strategy games (StarCraft), powering self-driving cars, and achieving dermatologist-level classification of skin cancer. In our current research, we focus on video analysis and multimedia information retrieval (MIR) using deep learning techniques.
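The layered structure described above can be illustrated with a minimal sketch in plain Python. It is not taken from the seminar material: the function names and the hand-picked parameters are assumptions chosen for illustration, and real networks learn their parameters from data using a framework. The sketch stacks two linear layers with a ReLU non-linearity in between.

```python
def linear(x, weights, bias):
    """Linear (affine) transformation: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise non-linear transformation."""
    return [max(0.0, v) for v in x]


def forward(x, layers):
    """Apply each (weights, bias) layer; ReLU follows every layer but the last."""
    for i, (weights, bias) in enumerate(layers):
        x = linear(x, weights, bias)
        if i < len(layers) - 1:
            x = relu(x)
    return x


# Hypothetical, hand-picked parameters: 2 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]

print(forward([3.0, 1.0], layers))
print(forward([1.0, 3.0], layers))
```

With these parameters the network outputs the absolute difference of its two inputs, a function that a single linear layer cannot represent. This is a small example of how a non-linearity between layers adds expressive power.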

Course language: German and English

Topics in this seminar:

Creating 'Fake' Lecture Videos: Methods based on deep learning are very powerful in the area of computer vision. Since the introduction of Generative Adversarial Networks (GANs) by Goodfellow et al., image generation has seen a large boost in popularity, and today very strong methods are available that can generate highly realistic, high-resolution images. The power of generative networks also comes with several downsides. One of them is the possibility of generating fake data that is nearly indistinguishable from real data for a human. These so-called deep fakes raise ethical discussions and also motivate more research into detecting whether image or video material is faked. But being able to fake data does not necessarily have to be a bad thing. Think about a video platform like OpenHPI, where lecture videos have to be recorded. These videos have to be post-processed to remove mistakes made by the speakers and to give the video a professional touch. This post-processing is highly time-consuming and could benefit from automatic methods that help to seamlessly remove mistakes. Another possibility could be to generate a realistic-looking video by only supplying a written transcript and letting the computer do the rest. In this seminar topic, we want to look at automatic methods for editing videos with only one speaker (the typical setting of lecture videos). If everything works as planned, we will implement a method where the user can modify the recorded video sequence by modifying an automatically extracted transcript of the words spoken by the lecturer. (Introduction video can be found here.)
Optimizing Inference of Binary Neural Networks: Convolutional neural networks have achieved astonishing results in different application areas. Various methods have been proposed that allow us to use these models on mobile and embedded devices. Binary Neural Networks (BNNs) in particular seem to be a promising approach for devices with low computational power or applications with real-time requirements. In this topic, you are going to optimize the inference of BNNs with BMXNet 2, based on advances in other frameworks. The final goal is to run a real-time machine learning demo application on a Raspberry Pi (provided by us) without relying on a network connection. (Introduction video can be found here.)
Handwriting Synthesis for Improved Optical Character Recognition (OCR): In recent years, handwriting OCR has remained an interesting field of study because it is still challenging to correctly handle the natural variation in handwriting, which differs from person to person. This can be problematic because a trained OCR model might not be able to recognize the handwriting of authors whose writing is too different from the training samples. A possible solution to this problem is to use Generative Adversarial Networks (GANs) to synthesize handwritten samples that have the same style as the author's writing. These generated samples can then be used to fine-tune the original OCR model and hopefully increase the accuracy on the samples of the newly introduced authors. (Introduction video can be found here.)
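For the Binary Neural Networks topic above, the following minimal sketch shows why binarization can speed up inference. It assumes the commonly used scheme in which weights and activations are constrained to -1 and +1, so that a dot product can be computed with XNOR and a bit count instead of floating-point multiplications. The function names and example values are hypothetical. This is not the BMXNet 2 implementation, which may differ in its details.

```python
def binarize(values):
    """Map real values to {-1, +1} with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a {-1, +1} vector into an int: bit i is set where signs[i] == +1."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed {-1, +1} vectors of length n."""
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask   # bit is 1 where both signs agree
    matches = bin(xnor).count("1")     # popcount
    return 2 * matches - n             # agreements minus disagreements


# Hypothetical example values.
weights = [0.7, -1.2, 0.1, -0.4]
activations = [-0.3, -0.9, 2.0, 0.5]

w, a = binarize(weights), binarize(activations)
reference = sum(wi * ai for wi, ai in zip(w, a))
fast = binary_dot(pack_bits(w), pack_bits(a), len(w))
print(reference, fast)
```

Both computations give the same result, but the packed version needs only one XNOR and one bit count per machine word. This is what makes the approach attractive on devices with low computational power.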

We are currently preparing more detailed video introductions to these topics:

This video (10 min) contains an introduction to the topic Creating 'Fake' Lecture Videos.
This video (12 min) contains a brief introduction to the topic Optimizing Inference of Binary Neural Networks.
This video (4 min) contains a brief introduction to the topic Handwriting Synthesis for OCR.

Prerequisites

Strong interest in video/image processing, machine learning (deep learning), and/or computer vision
Software development in C/C++ or Python
Experience with OpenCV and machine learning applications is a plus

Literature

Books:

Alex Smola, Mu Li et al., "Dive into Deep Learning"
Ian J. Goodfellow, Yoshua Bengio, and Aaron Courville, "Deep Learning", online version

Online courses:

cs231n tutorials: Convolutional Neural Networks for Visual Recognition
Deep Learning courses at Coursera

Deep Learning frameworks:

Assessment

The final evaluation will be based on:

Initial implementation / idea presentation, 10%
Final presentation, 20%
Report/Documentation, 12-18 pages, 30%
Implementation, 40%
Participation in the seminar (bonus points)

Dates

Monday, 15:15 - 16:45

(apart from the presentations, there will be no regular meetings in our seminar room!)

Virtual room

09.04.2021 14:00-15:00	Q&A session for seminar topics. (join us here)
until 09.04.2021	Enroll in the seminar with the Studienreferat (Studienreferat(at)hpi.uni-potsdam.de) (Send your preferred and secondary topic to: haojin.yang@hpi.de)
13.04.2021 (TUESDAY) 15:15 - 16:45	Meet other interested students, find a team, and ask questions. (join us here) (Send your preferred and secondary topic to: haojin.yang@hpi.de)
until 20.04.2021	Topics and Teams finalized
weekly	Meetings with your tutor
08.06.2021 (13:30)	Mid-Term presentation (15+5min)
20.07.2021 (13:30)	Final presentation (15+5min)
until end of August 2021	Hand in code and paper (LaTeX template)
until end of September 2021	Grading
