Hasso-Plattner-Institut25 Jahre HPI

Introduction to Image and Video Processing Techniques (Winter Semester 2022/2023)

Lecturers: Dr. Matthias Trapp (Computergrafische Systeme), Max Reimann (Computergrafische Systeme), Wattasseril Jobin Indiculla

General Information

  • Weekly Hours: 4
  • Credits: 6
  • Graded: yes
  • Enrolment Deadline: 01.10.2022 - 31.10.2022
  • Examination time §9 (4) BAMA-O: 15.12.2022
  • Teaching Form: Seminar / Project
  • Enrolment Type: Compulsory Elective Module
  • Course Language: German
  • Maximum number of participants: 15

Programs, Module Groups & Modules

IT-Systems Engineering BA

Description

This project seminar is aimed at Bachelor's students who wish to build upon fundamental image/video processing, computer vision, and computer graphics skills to design, develop, and deploy GPU-accelerated image and video processing techniques for mobile, desktop, and server systems. A short video showcasing results of recent courses can be found here: https://youtu.be/YNgGWarBFEY.

 

 

The course is primarily project-based and is divided into two parts:

The first part of the course is organized as a lecture series. The lecture topics are chosen together with the seminar students and can include introductions to the following basic concepts and foundations:

  • A short introduction to the field of image and video analytics,
  • Techniques for image and video processing,
  • Application development for mobile and desktop/server systems.

Using specific image and video processing operations, the course teaches how advanced image/video analysis techniques can be designed, developed, and tested.

In the second part of the course, participants will work individually or in teams (max. 2 members) to implement assigned topics in the field of interactive image and video processing. For all target systems, we provide middleware that can be used for development: a C++ framework for desktop applications, Android and iOS frameworks for mobile applications, and JS (Angular, Node framework) or Python (FastAPI framework) for service-based browser applications. Topics for this project seminar include, but are not limited to, the following domains:

  • Convolutional Neural Networks for image analysis and transformation.
  • LSTM and Attention-based networks for sequence modeling of videos.
  • Image and video processing for VR (Virtual Reality) and AR (Augmented Reality) applications.
  • Generative models (GANs, diffusion models) for image/video generation.
  • Web-based image processing using WebGPU or WebGL.
  • Integration of interactive rendering techniques in third-party applications.
  • Implementation of interactive image stylization and editing tools for desktop systems.
  • Service-based image and video processing.
  • Web-app development for service-based image and video processing.
  • Integration of deep learning frameworks into visual computing pipelines for videos.
  • Implementing effects for visual media abstraction.
  • Automated video summarization approaches to efficiently and effectively shorten videos:
    • Shot boundary detection using neural networks (e.g., TransNetV2)
    • Scene boundary segmentation using neural networks (e.g., SceneSeg)
    • Image/video captioning using deep learning (e.g., CLIP)
    • Multimodal video analysis (e.g., movienet-tools)
    • Query-based image and video retrieval approaches for video summarization (e.g., CLIP embedding-based retrieval)
    • Efficient deep-learning-based video classification models that can run on mobile devices (e.g., MoViNets)
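
To give a flavor of the query-based retrieval topic above, the following is a minimal sketch of ranking video frames by cosine similarity between a query embedding and per-frame embeddings. It is an illustration only: the three-dimensional vectors, the frame identifiers, and the function names `cosine_similarity` and `rank_frames` are hypothetical, and in an actual project the embeddings would come from a model such as CLIP rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_frames(query_embedding, frame_embeddings, top_k=2):
    """Return the top_k (frame_id, score) pairs, best match first."""
    scored = [
        (frame_id, cosine_similarity(query_embedding, embedding))
        for frame_id, embedding in frame_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (assumption): hand-written stand-ins for model-generated embeddings.
frames = {
    "frame_000": [0.9, 0.1, 0.0],
    "frame_120": [0.1, 0.8, 0.1],
    "frame_240": [0.7, 0.3, 0.0],
}
query = [1.0, 0.0, 0.0]

top_matches = rank_frames(query, frames)
for frame_id, score in top_matches:
    print(frame_id, score)
```

The frames selected this way could then serve as candidates for a query-focused summary; how they are assembled into a shortened video is left open here.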

Requirements

  • Successful completion of the lectures Computer Graphics I and/or II
  • Basic knowledge of the OpenGL (ES) Shading Language or the Metal Shading Language for image and video processing topics
  • Basic knowledge/understanding of neural networks and/or computer vision algorithms for image and video analysis topics
  • For service/web-app development: basic knowledge/understanding of Angular, Node.js, JavaScript, and Docker
  • For Android mobile development: basic knowledge of the Java programming language
  • For iOS development: basic knowledge of Swift development
  • For desktop development: basic knowledge of C++ development

Literature

  • C++11/C++14 reference: Stroustrup, Programming: Principles and Practice Using C++
  • JS reference: Haverbeke, Eloquent JavaScript (3rd edition)
  • Deep learning references:
    • General: Glassner, Deep Learning: A Visual Approach
    • PyTorch: Stevens et al., Deep Learning with PyTorch
    • TensorFlow 2.0: Géron, Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd edition)
  • Computer vision references:
    • General: Klette, Concise Computer Vision: An Introduction into Theory and Algorithms
    • 3D vision: Hartley and Zisserman, Multiple View Geometry in Computer Vision (2nd edition)
    • OpenCV 4: Howse and Minichino, Learning OpenCV 4 Computer Vision with Python 3 (3rd edition)
  • Topic-specific material will be provided throughout the course

Learning

Project seminar (4 SWS/6 ECTS)

Examination

The final grade will be determined as follows:

  • 50% Documented source code & prototypical application
  • 15% Concept presentation (approx. 10 minutes)
  • 25% Final presentation (approx. 25 minutes)
  • 10% Project management
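
As a rough illustration of how these weights combine, the sketch below computes a weighted average of four component grades. It assumes each component is graded on the same numeric scale (for example, the German 1.0 to 5.0 scale) and ignores any rounding or pass/fail rules, which are not specified here; the dictionary keys, the function name `final_grade`, and the example grades are hypothetical.

```python
# Weights taken from the grading breakdown above.
WEIGHTS = {
    "source_code_and_prototype": 0.50,
    "concept_presentation": 0.15,
    "final_presentation": 0.25,
    "project_management": 0.10,
}


def final_grade(component_grades):
    """Weighted average of the component grades (no rounding applied)."""
    if set(component_grades) != set(WEIGHTS):
        raise ValueError("expected exactly one grade per weighted component")
    return sum(WEIGHTS[name] * grade for name, grade in component_grades.items())


# Example grades (assumption): purely illustrative values.
example = {
    "source_code_and_prototype": 1.3,
    "concept_presentation": 2.0,
    "final_presentation": 1.7,
    "project_management": 1.0,
}
print(final_grade(example))
```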

Dates

The seminar topics will be presented in the kick-off meeting, which will be held on-site and via Zoom. The kick-off meeting will take place on Thursday, 20.10.2022, 9:15-10:45, in A-1.1.

The rest of the seminar is organized as follows: 

  • The individual topics will be assigned no later than 26.10.2022. After topic assignment, the project phase will begin.
  • The project phase is self-organized. Appointments are coordinated with the individual supervisors.
  • The midterm presentation will take place on Thursday, 15.12.2022, 9:15, in A-1.1.
  • Based on a vote by the students, the final presentation will take place in April 2023.

Please note: To participate in the Zoom meeting, please register for the respective Moodle course: https://moodle.hpi.de/course/view.php?id=356
