Hasso-Plattner-Institut: 20 Years of HPI

Machine Learning for Data Streams (Winter Semester 2019/2020)

Lecturers: Dr. Alexander Albrecht (Information Systems), Dr. Thorsten Papenbrock (Information Systems)

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment deadline: 30.10.2019
  • Course format: Seminar
  • Enrollment type: Compulsory elective module
  • Teaching language: English

Degree Programs & Modules

IT-Systems Engineering MA
  • ITSE-Analyse
  • ITSE-Entwurf
  • ITSE-Konstruktion
  • ITSE-Maintenance
  • OSIS-Konzepte und Methoden
  • OSIS-Techniken und Werkzeuge
  • OSIS-Spezialisierung
Data Engineering MA

Description

In this seminar, we study novel algorithms that learn from data streams.

Traditional machine learning algorithms are rarely applicable in scenarios with streaming data. Most were designed for offline settings, i.e., the entire data set must be scanned and processed (possibly multiple times) before a decision can be made.
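
To illustrate the contrast with offline processing, here is a minimal sketch of a one-pass approach. It is not taken from the seminar material: the names `RunningStats` and `flag_outliers`, the threshold of three standard deviations, and the choice of Python (the seminar itself uses Kafka Streams) are all assumptions made for brevity. Each value is judged as it arrives and then folded into a constant-size model, so the stream is never stored or re-read.

```python
import math
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass
class RunningStats:
    """Mean and variance maintained incrementally (Welford's algorithm)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one new value into the statistics in O(1) time and memory."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        """Sample standard deviation; 0.0 until two values have been seen."""
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def flag_outliers(
    stream: Iterable[float], threshold: float = 3.0
) -> Iterator[Tuple[float, bool]]:
    """Yield (value, is_outlier) for every value in a single pass.

    A value counts as an outlier if it lies more than `threshold` standard
    deviations away from the mean of the values seen before it
    (hypothetical rule, chosen only for illustration).
    """
    stats = RunningStats()
    for x in stream:
        is_outlier = (
            stats.count > 1
            and stats.std > 0
            and abs(x - stats.mean) > threshold * stats.std
        )
        stats.update(x)  # learn from the value after judging it
        yield x, is_outlier


if __name__ == "__main__":
    readings = [10.0, 10.2, 9.8, 10.1, 9.9, 50.0, 10.0]
    for value, is_outlier in flag_outliers(readings):
        print(value, "outlier" if is_outlier else "ok")
```

An offline version would first compute the mean and standard deviation over the complete data set and only then classify each value, which requires at least two scans and is impossible on an unbounded stream.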

In this seminar, students will implement, evaluate, and ideally improve machine learning algorithms for data streams from current research projects. We will look at algorithms for classification, regression, clustering, pattern mining, outlier detection, trend detection, and recommender systems.

Each team, consisting of two students, chooses and presents a challenging research task and implements the proposed solution as a research prototype using the streaming framework Apache Kafka with Kafka Streams.

This is a project seminar: there will be a few weekly lectures, including an introductory lecture and an invited industry talk on stream processing with Apache Kafka. Teams will meet frequently with their supervisor.

Assessment

Working in teams of two students, you will complete the following tasks:

  • Active participation during all seminar events.
  • Short presentation of the selected research paper.
  • Intermediate presentations demonstrating insights regarding your research prototype.
  • Regular meetings with your advisor.
  • Implementation of a research prototype with Kafka and Kafka Streams.
  • Final presentation demonstrating your solution.
  • Code & documentation (on GitHub). The documentation should contain information on how to execute and evaluate your solution. Furthermore, it should also show strengths and weaknesses of the implementation.
