Hasso Plattner Institute: 25 Years of HPI

Techniques & Algorithms for Image Data Processing (Summer Semester 2023)

Lecturers: Dr. Matthias Trapp (Computer Graphics Systems), Max Reimann (Computer Graphics Systems), Dr. Rico Richter (Computer Graphics Systems), Ole Wegen (Computer Graphics Systems)

General Information

  • Weekly hours per semester (SWS): 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.04.2023 - 07.05.2023
  • Course format: Seminar / Project
  • Enrollment type: Compulsory elective module
  • Language of instruction: German

Degree Programs, Module Groups & Modules

IT-Systems Engineering BA

Description

This project seminar is aimed at Bachelor's students who wish to build upon fundamental image/video processing, computer vision, and computer graphics skills to design, develop, and deploy GPU-accelerated image and video processing techniques for mobile, desktop, and server systems. A short video showcasing results of recent courses can be found here: https://youtu.be/YNgGWarBFEY.

The course is mainly project-based and is divided into two parts:

The first part of the course is organized as a lecture series. The lecture topics are chosen together with the seminar students and can include introductions to the following basic concepts and foundations:

  • A short introduction to the field of image and video analytics
  • Techniques for image and video processing
  • Application development for mobile and desktop/server systems

Using specific image and video processing operations, the course teaches how advanced image/video analysis techniques can be designed, developed, and tested.

In the second part of the course, participants work individually or in teams (max. 2 members) to implement assigned topics in the field of interactive image and video processing. For all target systems, we offer middleware that can be used for development: a C++ framework for desktop applications, an Android and iOS framework for mobile applications, and JavaScript (Angular, Node.js) or Python (FastAPI) frameworks for service-based browser applications. Topics for this project seminar include, but are not limited to, the following domains:

  • Convolutional Neural Networks for image analysis and transformation*.
  • LSTM and Attention-based networks for sequence modeling of videos.
  • Image and video processing for VR (Virtual Reality) and AR (Augmented Reality) applications.
  • Generative models (GANs, diffusion models) for image/video generation.
  • Web-based image processing using WebGPU or WebGL.
  • Integration of interactive rendering techniques in third-party applications.
  • Implementation of interactive image stylization and editing tools for desktop systems.
  • Service-based image and video processing*.
  • Web app development for service-based image and video processing.
  • Integration of deep learning frameworks into visual computing pipelines for videos.
  • Implementing effects for visual media abstraction*.
  • Automated video summarization approaches to shorten videos efficiently and effectively*:
    • Shot boundary detection using neural networks (e.g., TransNetV2)
    • Scene boundary segmentation using neural networks (e.g., SceneSeg)
    • Image/video captioning using deep learning (e.g., CLIP)
    • Multimodal video analysis (e.g., movienet-tools)
    • Query-based image and video retrieval approaches for video summarization (e.g., CLIP embedding-based retrieval)
    • Efficient deep learning-based video classification models that can run on mobile devices (e.g., MoViNets)

Topics marked by a * are related to a joint research project of the Hasso Plattner Institute with Digital Masterpieces and the German Federal Ministry of Education, which investigates new concepts and techniques for multidimensional video processing and automatic video abstraction.

Prerequisites

  • Successful completion of the lectures Computer Graphics I and/or II
  • Basic knowledge of the OpenGL (ES) Shading Language or the Metal Shading Language for image and video processing topics
  • Basic knowledge/understanding of neural networks and/or computer vision algorithms for image and video analysis topics
  • For service/web app development: basic knowledge/understanding of Angular, Node.js, JavaScript, and Docker
  • For Android mobile development: basic knowledge of the Java programming language
  • For iOS development: basic knowledge of Swift development
  • For desktop development: basic knowledge of C++ development

Literature

  • C++11/C++14 reference: Stroustrup, Programming: Principles and Practice Using C++
  • JS reference: Haverbeke, Eloquent JavaScript (3rd edition)
  • Deep learning references:
    • General: Glassner, Deep Learning: A Visual Approach
    • PyTorch: Stevens et al., Deep Learning with PyTorch
    • TensorFlow 2.0: Géron, Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd edition)
  • Computer vision references:
    • General: Klette, Concise Computer Vision: An Introduction into Theory and Algorithms
    • 3D vision: Hartley and Zisserman, Multiple View Geometry in Computer Vision (2nd edition)
    • OpenCV 4: Howse and Minichino, Learning OpenCV 4 Computer Vision with Python 3 (3rd edition)
  • Topic-specific material will be provided throughout the course

Teaching and Learning Formats

Project seminar (4 SWS/6 ECTS)

Assessment

The final grade will be determined as follows:

  • 50% Documented source code & prototypical application
  • 15% Concept presentation (approx. 10 minutes)
  • 25% Final presentation (approx. 25 minutes)
  • 10% Project management

Dates

The seminar topics will be presented in the kick-off meeting, which will be held on-site and via Zoom. The kick-off meeting will take place on Wednesday, 19.04.2023, 13:30 - 15:00, in A-1.2.

The rest of the seminar is organized as follows: 

  • The individual topics will be assigned no later than 30.04.2023. After topic assignment, the project phase will kick off.
  • The project part will start in a self-organized way. Appointments are coordinated with the individual supervisors.
  • The midterm presentation will take place in the period 14.06.-25.06.2023.
  • Based on a student vote, the final presentation will take place in September 2023.

Please note: To participate in the Zoom meeting, please register in the respective Moodle course: https://moodle.hpi.de/course/view.php?id=436
