Hasso-Plattner-Institut

Practical Video Analyses (Summer Semester 2015)

Lecturers: Prof. Dr. Christoph Meinel (Internet Technologies and Systems), Dr. Haojin Yang (Internet Technologies and Systems)

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment deadline: 24.04.2015
  • Teaching form: Seminar
  • Enrollment type: Compulsory elective module
  • Maximum number of participants: 12

Degree Programs & Modules

IT-Systems Engineering BA
IT-Systems Engineering MA
  • IT-Systems Engineering A
  • IT-Systems Engineering B
  • IT-Systems Engineering C
  • IT-Systems Engineering D


In the last decade, digital libraries and web video portals have become increasingly popular, and the amount of video data available on the World Wide Web (WWW) is growing rapidly. According to the official statistics of the video portal YouTube, more than 6 billion hours of video are watched each month and about 100 hours of video are uploaded every minute. Efficiently retrieving video data on the web or within large video archives has therefore become an important and challenging task.

In our current research, we focus on state-of-the-art techniques for video analysis and multimedia information retrieval (MIR). Potential topics include video Shot Boundary Detection (SBD), where a video stream is separated into a set of representative key-frames. SBD often serves as a basis for further video analysis tasks. Video Text Detection (Video OCR) is one of the most intensively researched topics in the MIR domain. Here we focus on improving existing approaches by using deep learning techniques. Video Genre Classification is another topic that has attracted growing attention recently. An approach will be developed based on multimodal video information such as video key-frames, frame concepts, and topics from video texts. Further potential topics include personal ID document recognition (ID card) and face identification.

In this seminar, various methods for automatic video analysis and retrieval will be studied and developed.
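
To give a feel for the kind of problem involved, the following is a minimal sketch of one simple way to approach shot boundary detection: comparing gray-level histograms of consecutive frames and reporting a cut wherever they differ strongly. It is an illustration only, not the method used in the seminar. The frame representation (a flat sequence of grayscale values from 0 to 255), the function names, the bin count and the threshold of 0.5 are all assumptions made for this example, and picking the middle frame of each shot as its key-frame is just one possible convention.

```python
from typing import List, Sequence

# Assumed frame representation for this sketch: flat grayscale pixels, 0..255.
Frame = Sequence[int]


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Return the normalized gray-level histogram of one frame."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // 256] += 1
    return [count / len(frame) for count in counts]


def histogram_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Half the L1 distance: 0.0 for identical, 1.0 for disjoint histograms."""
    return 0.5 * sum(abs(x - y) for x, y in zip(a, b))


def detect_shot_boundaries(frames: Sequence[Frame],
                           threshold: float = 0.5,
                           bins: int = 16) -> List[int]:
    """Return the indices of frames that start a new shot (hard cuts only)."""
    boundaries = []
    previous = None
    for index, frame in enumerate(frames):
        current = histogram(frame, bins)
        if previous is not None and histogram_distance(previous, current) > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


def key_frames(num_frames: int, boundaries: Sequence[int]) -> List[int]:
    """Pick the middle frame of every shot as its key-frame (hypothetical rule)."""
    starts = [0] + list(boundaries)
    ends = list(boundaries) + [num_frames]
    return [(start + end - 1) // 2 for start, end in zip(starts, ends)]


if __name__ == "__main__":
    # Synthetic video: 5 dark frames, 4 bright frames, 3 dark frames.
    dark = [20] * 64
    bright = [230] * 64
    video = [dark] * 5 + [bright] * 4 + [dark] * 3
    cuts = detect_shot_boundaries(video)
    print(cuts)                          # [5, 9]
    print(key_frames(len(video), cuts))  # [2, 6, 10]
```

A fixed global threshold like this only catches abrupt cuts; gradual transitions such as fades and dissolves would need a different treatment, which is part of what makes the topic interesting.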


Prerequisites

  • Strong interest in video/image processing, machine learning and/or computer vision

  • Software development in C/C++

  • Experience with OpenCV and machine learning applications is a plus


  • Haojin Yang, Bernhard Quehl and Harald Sack, "A Framework for Improved Video Text Detection and Recognition", Multimedia Tools and Applications (MTAP), special issue "Computer Vision for Multimedia", vol. 69, no. 1, pp. 217-245, Publisher: Springer US, 2014. DOI: http://dx.doi.org/10.1007/s11042-012-1250-6

  • B. Epshtein, E. Ofek and Y. Wexler, "Detecting text in natural scenes with stroke width transform", 2010 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2963-2970, 13-18 June 2010. DOI: 10.1109/CVPR.2010.5540041

  • Tao Wang, D. J. Wu, A. Coates and A. Y. Ng, "End-to-end text recognition with convolutional neural networks", 2012 21st International Conference on Pattern Recognition (ICPR), pp. 3304-3308, 11-15 Nov. 2012

  • Andrej Karpathy, Sanketh Shetty, George Toderici, Rahul Sukthankar, Thomas Leung and Li Fei-Fei, "Large-scale Video Classification with Convolutional Neural Networks", IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014

  • P. Sidiropoulos, V. Mezaris, I. Kompatsiaris, H. Meinedo, M. Bugalho and I. Trancoso, "Temporal Video Segmentation to Scenes Using High-Level Audiovisual Features", IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 8, pp. 1163-1177, Aug. 2011. DOI: 10.1109/TCSVT.2011.2138830

  • Z. Kalal, K. Mikolajczyk and J. Matas, "Tracking-Learning-Detection", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 7, pp. 1409-1422, July 2012. DOI: 10.1109/TPAMI.2011.239


The final evaluation will be based on:

  • Initial implementation / idea presentation, 10%

  • Final presentation, 20%

  • Report/Documentation, 12-18 pages, 30%

  • Implementation, 40%

  • Participation in the seminar (bonus points)


Monday, 09:15-10:45

Room A-1.2

13.04.2015 09:15-10:45

Presentation of the topics (PDF)

20.04.2015, by 23:59

Topic selection (register on Doodle)


Announcement of topic and group assignments


Individual meetings with the supervisor


Technology talks and guided discussion (15+5 min each)


Presentation of the final results (15+5 min each)

Early August

Submission of implementation and documentation

By the end of August

Grading of the work