Dynamic Programming and Reinforcement Learning (Summer Semester 2023)
Lecturers:
Dr. Rainer Schlosser (Enterprise Platform and Integration Concepts), Alexander Kastius
Course website:
https://hpi.de/herbrich/teaching/dynamic-programming-and-reinforcement-learning.html
General Information
- Weekly hours per semester: 4
- ECTS: 6
- Graded: Yes
- Enrollment period: 01.04.2023 - 07.05.2023
- Examination date per §9 (4) BAMA-O: 18.07.2023
- Teaching format: Seminar / Exercise
- Enrollment type: Compulsory elective module
- Course language: English
Degree Programs, Module Groups & Modules
- BPET: Business Process & Enterprise Technologies
- HPI-BPET-K Concepts and Methods
- BPET: Business Process & Enterprise Technologies
- HPI-BPET-S Specialization
- BPET: Business Process & Enterprise Technologies
- HPI-BPET-T Techniques and Tools
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-K Concepts and Methods
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-S Specialization
- OSIS: Operating Systems & Information Systems Technology
- HPI-OSIS-T Techniques and Tools
- DANA: Data Analytics
- HPI-DANA-K Concepts and Methods
- DANA: Data Analytics
- HPI-DANA-T Techniques and Tools
- DANA: Data Analytics
- HPI-DANA-S Specialization
- SCAD: Scalable Computing and Algorithms for Digital Health
- HPI-SCAD-C Concepts and Methods
- SCAD: Scalable Computing and Algorithms for Digital Health
- HPI-SCAD-T Technologies and Tools
- SCAD: Scalable Computing and Algorithms for Digital Health
- HPI-SCAD-S Specialization
- DICR: Digitalization of Clinical and Research Processes
- HPI-DICR-C Concepts and Methods
- DICR: Digitalization of Clinical and Research Processes
- HPI-DICR-T Technologies and Tools
- DICR: Digitalization of Clinical and Research Processes
- HPI-DICR-S Specialization
- HPI-SSE-C Conceptual Foundations
- HPI-SSE-D Data Foundations
- DSYS: Data-Driven Systems
- HPI-DSYS-C Concepts and Methods
- DSYS: Data-Driven Systems
- HPI-DSYS-T Technologies and Tools
- DSYS: Data-Driven Systems
- HPI-DSYS-S Specialization
- MALA: Machine Learning and Analytics
- HPI-MALA-C Concepts and Methods
- MALA: Machine Learning and Analytics
- HPI-MALA-T Technologies and Tools
- MALA: Machine Learning and Analytics
- HPI-MALA-S Specialization
- MODA: Models and Algorithms
- HPI-MODA-C Concepts and Methods
- MODA: Models and Algorithms
- HPI-MODA-T Technologies and Tools
- MODA: Models and Algorithms
- HPI-MODA-S Specialization
Description
The need for automated decision-making is steadily increasing, which makes data-driven decision-making techniques essential. We consider a system that follows certain dynamics and has to be tuned or controlled over time such that given constraints are satisfied and a specified objective is optimized. Typically, the current state of the system as well as the interplay of rewards and potential future states associated with certain actions have to be taken into account. The dynamics and state transitions may have to be estimated from data using suitable ML-based techniques.
In general, exact solution approaches for such dynamic optimization problems do not scale, so heuristics often have to be used (e.g., when the number of states is too large, cf. the curse of dimensionality). Besides classical approaches such as dynamic programming (DP), state-of-the-art heuristic optimization techniques such as approximate dynamic programming (ADP) or reinforcement learning (RL) are suitable alternatives.
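As a minimal sketch of the classical DP approach mentioned above, the following Python snippet runs value iteration on a hypothetical two-state, two-action Markov decision process. The transition table, rewards, discount factor, and all names (`TRANSITIONS`, `q_value`, `value_iteration`) are illustrative assumptions and are not taken from the course material.

```python
# Hypothetical toy MDP; every number here is an illustrative assumption.
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    0: {
        0: [(1.0, 0, 0.0)],                 # stay in state 0, no reward
        1: [(0.8, 1, 1.0), (0.2, 0, 0.0)],  # try to move to state 1
    },
    1: {
        0: [(1.0, 1, 2.0)],                 # stay in state 1, collect reward
        1: [(1.0, 0, 0.0)],                 # return to state 0
    },
}


def q_value(outcomes, values, gamma):
    """Expected immediate reward plus discounted value of the next state."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions, gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update of the state value
        if delta < tol:
            break
    # Extract the greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(TRANSITIONS)
    print(values, policy)
```

Each sweep enumerates every state and action, which is the step that stops scaling once the state space becomes large; this is where heuristics such as ADP or RL come in.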
Goals of the Course
Understand ...
- opportunities and challenges of decision-making
- static deterministic problems
- stochastic dynamic problems
- optimization models and solution techniques
Do ...
- work in small teams
- set up suitable models, apply optimization techniques
- simulate controlled processes, compare performance results
Improve/Learn ...
- mathematical, analytical, and modelling skills
- optimization techniques
- dynamic programming methods
- reinforcement learning methods
Prerequisites
- interest in quantitative methods and stochastics
- programming skills/experience
- the number of participants is not restricted
Literature
Learning and Teaching Formats
The course is a combination of a lecture and a practical part:
- teachers impart relevant knowledge and methods
- students work on a self-contained topic in teams of about three people
- students present and document their work
Assessment
Portfolio assessment for ITSE, DE, and DH students, consisting of:
- (i) final presentation of project results (July 18)
- (ii) project documentation at the end of the module (Sep 15)