Hasso-Plattner-InstitutSDG am HPI

Practical Video Analyses (Summer Semester 2017)

Lecturer: Dr. Haojin Yang (Internet-Technologien und -Systeme)
Tutor: Christian Bartz

General Information

  • Weekly Hours: 4
  • Credits: 6
  • Graded: yes
  • Enrolment Deadline: 28.04.2017
  • Teaching Form: Seminar
  • Enrolment Type: Compulsory Elective Module
  • Maximum number of participants: 12

Programs & Modules

IT-Systems Engineering MA
  • ISAE-Spezialisierung
  • ISAE-Techniken und Werkzeuge
  • OSIS-Konzepte und Methoden
  • OSIS-Spezialisierung
  • OSIS-Techniken und Werkzeuge
  • ITSE-Analyse
  • ITSE-Entwurf
  • ITSE-Konstruktion
  • ITSE-Maintenance


Over the last decade, digital libraries and web video portals have become increasingly popular, and the amount of video data available on the World Wide Web (WWW) is growing rapidly. According to the official statistics of the video portal YouTube, more than 400 hours of video are uploaded every minute. Efficiently retrieving video data on the web or within large video archives has therefore become an important and challenging task.

In our current research we focus on video analysis and multimedia information retrieval (MIR) using Deep Learning techniques. Deep Learning (DL), an area of machine learning that has been emerging since 2006, already influences a wide range of multimedia information processing. Recently, DL-based techniques have achieved substantial progress in fields including computer vision, speech recognition, image classification and natural language processing (NLP).

Topics in this seminar:

  • A general LSTM (Long Short-Term Memory) framework for NLP applications: In this topic we plan to develop a general framework that can work with various NLP (Natural Language Processing) applications, such as machine translation, sentiment analysis and word sense disambiguation. The technical core is the combination "LSTM network + word vectors", which has proven highly effective in many cases, including Google Translate. You are expected to create a sequence-to-sequence LSTM model and adapt it into a sequence-to-point structure in order to handle different NLP tasks. We believe that by investing time and effort in this topic you will not only learn the theory and improve your programming skills on specific tasks, but also come to understand current trends in NLP and DL.
  • Adversarial training for medical image segmentation applications: Medical imaging is an important step in diagnosis for surgical or chemical planning. Magnetic resonance imaging (MRI) provides rich information before and during treatment for evaluating the treatment and lesion progress. In medical image analysis, automated lesion segmentation is an important and very challenging clinical diagnostic task. Given the promising results achieved by deep learning in many application fields, an automated application based on adversarial training is a very practical and interesting topic. We currently have two datasets, one for brain tumor segmentation and one for liver tumor segmentation, which will be used in this topic [6,7].

  • PlaceRecognizer: If you like to travel, you have most certainly stood somewhere in an unfamiliar city and asked yourself: What kind of building is this? What is it for? Who was its architect? Well, fear no more! Thanks to modern computer vision technology we might be able to answer these questions for you right on your smartphone! In this seminar topic we want to look at how to create a robust deep learning model that is able to recognize buildings in a given image. To do this we will need to gather training data (e.g. from street view images) and think of a good network architecture and training method for such a model. So if you are interested in data gathering, training deep neural networks and maybe also Android development, this topic is perfectly suited for you!
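
To make the "sequence-to-sequence" versus "sequence-to-point" distinction in the LSTM topic more concrete, the following is a minimal sketch in plain Python, not the framework to be built in the seminar. It assumes a single LSTM layer with random, untrained weights; the class `TinyLSTM`, the helper functions and the toy word vectors are hypothetical names and data introduced only for illustration. A real translation model would typically use an encoder-decoder pair and a deep learning library; here "sequence-to-sequence" simply means one output per input token, and "sequence-to-point" means keeping only the final hidden state.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and bias per gate:
        # input (i), forget (f), output (o) and candidate (g).
        self.W = {gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                         for _ in range(hidden_size)] for gate in "ifog"}
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}

    def _gate(self, name, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + bias)
                for row, bias in zip(self.W[name], self.b[name])]

    def step(self, x, h, c):
        z = list(x) + list(h)  # concatenate input and previous hidden state
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        o = self._gate("o", z, sigmoid)
        g = self._gate("g", z, math.tanh)
        # New cell state: keep part of the old state, add part of the candidate.
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        return h_new, c_new

    def run(self, sequence):
        """Return the hidden state after every time step."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        states = []
        for x in sequence:
            h, c = self.step(x, h, c)
            states.append(h)
        return states


def sequence_to_sequence(lstm, sequence):
    """One output vector per input token (e.g. for tagging each word)."""
    return lstm.run(sequence)


def sequence_to_point(lstm, sequence):
    """One output vector for the whole sequence (e.g. for sentiment analysis)."""
    return lstm.run(sequence)[-1]


# Three made-up 4-dimensional word vectors standing in for a short sentence.
sentence = [[0.1, 0.2, 0.3, 0.4], [0.0, -0.1, 0.5, 0.2], [0.3, 0.3, -0.2, 0.1]]
lstm = TinyLSTM(input_size=4, hidden_size=5)
per_token = sequence_to_sequence(lstm, sentence)  # list of 3 vectors
summary = sequence_to_point(lstm, sentence)       # a single vector
```

In this sketch both structures share the same recurrent layer; only the choice of which hidden states are passed on changes, which is one plausible reading of "adjusting" a sequence-to-sequence model into a sequence-to-point one.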


Requirements:

  • Strong interest in video/image processing, machine learning (Deep Learning) and/or computer vision

  • Software development in C/C++ or Python

  • Experience with OpenCV and machine learning applications is a plus



The final evaluation will be based on:

  • Initial implementation / idea presentation, 10%

  • Final presentation, 20%

  • Report/Documentation, 12-18 pages, 30%

  • Implementation, 40%

  • Participation in the seminar (bonus points)


Monday, 13:30-15:00

Room H-2.58

  • 24.04.2017, 13:30-15:00: Presentation of the topics (PDF)
  • Until 27.04.2017: Choice of topics (sign up on Doodle)
  • Announcement of the topic and group assignments
  • Individual meetings with the supervisor
  • Technology talks and guided discussion (15+5 min each)
  • Presentation of the final results (15+5 min each)
  • Until mid-August: Submission of implementation and documentation
  • Until end of September: Grading