Practical Video Analyses (Summer Semester 2017)
Lecturer:
Dr. Haojin Yang
(Internet-Technologien und -Systeme)
General Information
- Weekly Hours: 4
- Credits: 6
- Graded: yes
- Enrolment Deadline: 28.04.2017
- Teaching Form: Seminar
- Enrolment Type: Compulsory Elective Module
- Maximum number of participants: 12
Programs, Module Groups & Modules
- ISAE: Internet, Security & Algorithm Engineering
- HPI-ISAE-S Spezialisierung
- ISAE: Internet, Security & Algorithm Engineering
- HPI-ISAE-T Techniken und Werkzeuge
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-K Konzepte und Methoden
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-S Spezialisierung
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-T Techniken und Werkzeuge
- IT-Systems Engineering
Description
In the last decade, digital libraries and web video portals have become increasingly popular, and the amount of video data available on the World Wide Web (WWW) is growing rapidly. According to the official statistics report of the video portal YouTube, more than 400 hours of video are uploaded every minute. Efficiently retrieving video data on the web or within large video archives has therefore become an important and challenging task.
In our current research we focus on video analysis and multimedia information retrieval (MIR) using deep learning techniques. Deep learning (DL), a comparatively new area of machine learning (since 2006), has already had an impact on a wide range of multimedia information processing tasks. Recently, DL-based techniques have achieved substantial progress in fields including computer vision, speech recognition, image classification, and natural language processing (NLP).
Topics in this seminar:
- A general LSTM (Long Short-Term Memory) framework for NLP applications: In this topic we plan to develop a general framework that can work with various NLP (natural language processing) applications, such as machine translation, sentiment analysis, and word sense disambiguation. The technical core is "LSTM network + word vectors", which has proven highly effective in many cases, including Google Translate. You are expected to create a sequence-to-sequence LSTM model and adapt it to a sequence-to-point structure in order to handle different NLP tasks. We believe that by investing time and effort in this topic you will not only learn the theory and improve your programming skills on specific tasks, but also come to understand current trends in NLP and DL.
- Adversarial training for medical image segmentation applications: Medical imaging is an important step in diagnosis and in planning surgical or chemical treatment. Magnetic resonance imaging (MRI) provides rich information before and during treatment for evaluating the treatment and the progress of lesions. In medical image analysis, automated lesion segmentation is an important and very challenging clinical diagnostic task. Given the promising results that deep learning has achieved in many application fields, an automated application based on adversarial training is a practical and interesting topic. We currently have two datasets, one for brain tumor segmentation and one for liver tumor segmentation, from which the data for this topic will be selected. [6,7]
- PlaceRecognizer: If you like to travel, you have almost certainly stood somewhere in an unfamiliar city and asked yourself: What kind of building is this? What is it for? Who was its architect? Well, fear no more! Thanks to modern computer vision technology, we might be able to answer these questions for you right on your smartphone! In this seminar topic we want to look at how to create a robust deep learning model that is able to recognize buildings from a given image. To do this we will need to gather training data (e.g. from street view images) and think of a good network architecture and training method for such a model. So if you are interested in data gathering, training deep neural networks, and maybe also Android development, this topic is perfectly suited for you!
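As a rough illustration of the first topic, the difference between a sequence-to-sequence and a sequence-to-point structure can be sketched in plain Python. This is not the seminar's framework: the class TinyLSTM, the two helper functions, and the toy WORD_VECTORS table are hypothetical names, the weights are random and untrained, and "sequence-to-sequence" is simplified here to one output per input word rather than a full encoder-decoder model.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM over lists of floats (illustrative, untrained)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, o = output, g = candidate cell values.
        self.w = {gate: [[rng.uniform(-0.5, 0.5) for _ in range(width)]
                         for _ in range(hidden_size)] for gate in "ifog"}
        self.b = {gate: [0.0] * hidden_size for gate in "ifog"}

    def _gate(self, name, xh, squash):
        return [squash(sum(w * v for w, v in zip(row, xh)) + bias)
                for row, bias in zip(self.w[name], self.b[name])]

    def step(self, x, h, c):
        xh = x + h  # concatenate the input vector and the previous hidden state
        i = self._gate("i", xh, sigmoid)
        f = self._gate("f", xh, sigmoid)
        o = self._gate("o", xh, sigmoid)
        g = self._gate("g", xh, math.tanh)
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def run(self, sequence):
        """Return the hidden state after every time step."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        states = []
        for x in sequence:
            h, c = self.step(x, h, c)
            states.append(h)
        return states


def sequence_to_sequence(lstm, sequence):
    # One output vector per input word (e.g. for tagging-style tasks).
    return lstm.run(sequence)


def sequence_to_point(lstm, sequence):
    # Only the final hidden state, summarizing the whole sentence
    # (e.g. as input to a sentiment classifier).
    return lstm.run(sequence)[-1]


# Toy two-dimensional word vectors; real ones would be learned or pretrained.
WORD_VECTORS = {"good": [1.0, 0.0], "bad": [-1.0, 0.0], "movie": [0.0, 1.0]}

sentence = ["good", "movie"]
inputs = [WORD_VECTORS[word] for word in sentence]
lstm = TinyLSTM(input_size=2, hidden_size=3)
per_step = sequence_to_sequence(lstm, inputs)  # list of 2 vectors
summary = sequence_to_point(lstm, inputs)      # single vector of length 3
```

In a real project the hidden states would feed a trained output layer, and a deep learning library would replace this hand-written cell; the sketch only shows that both structures can share the same recurrent core.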
Requirements
- Strong interest in video/image processing, machine learning (deep learning), and/or computer vision
- Software development in C/C++ or Python
- Experience with OpenCV and machine learning applications is a plus
Literature
Examination
The final evaluation will be based on:
- Initial implementation / idea presentation, 10%
- Final presentation, 20%
- Report/Documentation, 12-18 pages, 30%
- Implementation, 40%
- Participation in the seminar (bonus points)
Dates
Monday, 13:30-15:00
Room H-2.58
24.04.2017 13:30-15:00 | Presentation of the topics (PDF) |
until 27.04.2017 | Choice of topics (sign up on Doodle) |
28.04.2017 | Announcement of topic and group assignments |
weekly | Individual meetings with the supervisor |
29.05.2017 | Technology talks and guided discussion (15+5 min each) |
24.07.2017 | Presentation of the final results (15+5 min each) |
until mid-August | Submission of implementation and documentation |
until the end of September | Grading |