Introduction to Image and Video Processing Techniques (Winter Semester 2022/2023)
Lecturers: Dr. Matthias Trapp (Computergrafische Systeme), Max Reimann (Computergrafische Systeme), Wattasseril Jobin Indiculla
General Information
- Weekly hours per semester (SWS): 4
- ECTS: 6
- Graded: Yes
- Enrollment period: 01.10.2022 - 31.10.2022
- Examination date per §9 (4) BAMA-O: 15.12.2022
- Course format: Seminar / Project
- Enrollment type: Compulsory elective module
- Teaching language: German
- Maximum number of participants: 15
Degree Programs, Module Groups & Modules
- HCGT: Human Computer Interaction & Computer Graphics Technology
- ISAE: Internet, Security & Algorithm Engineering
- SAMT: Software Architecture & Modeling Technology
- OSIS: Operating Systems & Information Systems Technology
Description
This project seminar is aimed at Bachelor's students who wish to build upon fundamental image/video processing, computer vision, and computer graphics skills to design, develop, and deploy GPU-accelerated image and video processing techniques for mobile, desktop, and server systems. A short video showcasing results of recent courses can be found here: https://youtu.be/YNgGWarBFEY.
The course is mainly project-based and is divided into two parts:
The first part of the course is organized as a lecture series. The lecture topics are specified together with the seminar students and can include an introduction to the following basic concepts and foundations:
- A short introduction to the field of image and video analytics,
- Techniques for image and video processing,
- Application development for mobile and desktop/server systems
Using specific image and video processing operations, the course teaches how advanced image/video analysis techniques can be designed, developed, and tested.
In the second part of the course, participants work individually or in teams (max. 2 members) to implement assigned topics in the field of interactive image and video processing. Middleware for development is offered for all target systems. For example, a C++ framework for desktop applications, Android and iOS frameworks for mobile applications, and JS (Angular, Node framework) or Python (FastAPI framework) for service-based browser applications will be provided. Topics for this project seminar include, but are not limited to, the following domains:
- Convolutional Neural Networks for image analysis and transformation.
- LSTM and Attention-based networks for sequence modeling of videos.
- Image and video processing for VR (Virtual Reality) and AR (Augmented Reality) applications.
- Generative models (GANs, diffusion models) for image/video generation.
- Web-based image processing using WebGPU or WebGL.
- Integration of interactive rendering techniques in third-party applications.
- Implementation of interactive image stylization and editing tools for desktop systems.
- Service-based image and video processing.
- Web-app development for service-based image and video processing.
- Integration of deep learning frameworks into visual computing pipelines for videos.
- Implementing effects for visual media abstraction.
- Automated video summarization approaches to efficiently and effectively shorten videos:
  - Shot boundary detection using neural networks (e.g., TransNetV2)
  - Scene boundary segmentation using neural networks (e.g., SceneSeg)
  - Image/video captioning using deep learning (e.g., CLIP)
  - Multimodal video analysis (e.g., movienet-tools)
  - Query-based image and video retrieval approaches for video summarization (e.g., CLIP embedding based retrieval)
- Efficient deep learning based video classification models that can run on mobile devices (e.g., MoViNets)
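To give a rough idea of what the embedding-based retrieval topic listed above involves, the following is a minimal sketch, not course material. It ranks video frames by cosine similarity between a query embedding and per-frame embeddings. The vectors are made-up toy values standing in for embeddings that a model such as CLIP would produce, and the function names `cosine_similarity` and `rank_frames` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as "no similarity"
    return dot / (norm_a * norm_b)


def rank_frames(query_embedding, frame_embeddings, top_k=2):
    """Return the indices of the top_k frames most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, emb), idx)
        for idx, emb in enumerate(frame_embeddings)
    ]
    # Sort by similarity only, highest first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:top_k]]


# Toy data (assumption): one 3-dimensional embedding per frame.
frame_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 0.0, 1.0],
]
query_embedding = [1.0, 0.0, 0.0]

top_frames = rank_frames(query_embedding, frame_embeddings, top_k=2)
print(top_frames)
```

In a real project, the toy vectors would be replaced by embeddings computed from the query text and the video frames, and the selected frames could serve as candidates for a summary.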
Prerequisites
- Successful completion of the lectures Computer Graphics I and/or II
- Basic knowledge of OpenGL (ES) Shading Languages or Metal Shading Language for image and video processing topics
- Basic knowledge/understanding of neural networks and/or computer vision algorithms for image and video analysis topics
- For Service/WebApps development: basic knowledge/understanding of Angular, Node.js, JavaScript, and Docker
- For Android mobile development: basic knowledge of the Java programming language
- For iOS development: basic knowledge of Swift development
- For Desktop development: basic knowledge of C++ development
Literature
- C++11/C++14 reference: Stroustrup, Programming: Principles and Practice Using C++
- JS reference: Haverbeke, Eloquent JavaScript (3rd edition)
- Deep learning references:
  - General: Glassner, Deep Learning: A Visual Approach
  - PyTorch: Stevens et al., Deep Learning with PyTorch
  - TensorFlow 2.0: Géron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems (2nd edition)
- Computer vision references:
  - General: Klette, Concise Computer Vision: An Introduction into Theory and Algorithms
  - 3D vision: Hartley and Zisserman, Multiple View Geometry in Computer Vision (2nd edition)
  - OpenCV 4: Howse and Minichino, Learning OpenCV 4 Computer Vision with Python 3 (3rd edition)
- Topic-specific material will be provided throughout the course
Learning and Teaching Formats
Project seminar (4 SWS/6 ECTS)
Assessment
The final grade will be determined as follows:
- 50% Documented source code & prototypical application
- 15% Concept presentation (approx. 10 minutes)
- 25% Final presentation (approx. 25 minutes)
- 10% Project management
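The weighting above amounts to a weighted sum of the four component grades. The following minimal sketch illustrates the arithmetic only. The component grades are made-up example values on the German 1.0 to 5.0 scale, the names are hypothetical, and whether the result is rounded to an official grade step is not stated here.

```python
# Weights taken from the list above; the dictionary keys are hypothetical names.
WEIGHTS = {
    "source_code_and_prototype": 0.50,
    "concept_presentation": 0.15,
    "final_presentation": 0.25,
    "project_management": 0.10,
}


def final_grade(component_grades):
    """Weighted sum of the component grades (no rounding applied)."""
    if abs(sum(WEIGHTS.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(WEIGHTS[name] * component_grades[name] for name in WEIGHTS)


# Made-up example grades (assumption), not real results.
example_grades = {
    "source_code_and_prototype": 1.3,
    "concept_presentation": 2.0,
    "final_presentation": 1.7,
    "project_management": 1.0,
}

result = final_grade(example_grades)
print(round(result, 3))
```

With these example values, the source code and prototype component dominates the result because it carries half of the total weight.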
Dates
The seminar topics will be presented in the kick-off meeting, which will be held on-site and via Zoom. The kick-off meeting takes place on Thursday, 20.10.2022, 9:15-10:45, in A-1.1.
The rest of the seminar is organized as follows:
- The individual topics are assigned no later than 26.10.2022. After topic assignment, the project phase will kick off.
- The project part will start in a self-organized way. Appointments are coordinated with the individual supervisors.
- The midterm presentation will take place on Thursday, 15.12.2022, 9:15, in A-1.1.
- Based on a vote among the students, the final presentation will take place in April 2023.
Please note: To participate in the Zoom meeting, please register in the respective Moodle course: https://moodle.hpi.de/course/view.php?id=356