Hasso-Plattner-Institut (25 Years of HPI)

Source Detection in Complex Networks (Summer Semester 2023)

Lecturer: Dr. Katharina Baum (Data Analytics and Computational Statistics)
Course website: https://moodle.hpi.de/course/view.php?id=444

General Information

  • Weekly hours per semester: 4
  • ECTS: 6
  • Graded: Yes
  • Enrollment period: 01.04.2023 - 07.05.2023
  • Examination date per §9 (4) BAMA-O: 18.07.2023
  • Teaching format: Seminar
  • Enrollment type: Compulsory elective module
  • Language of instruction: English
  • Maximum number of participants: 4

Degree Programs, Module Groups & Modules

Data Engineering MA
IT-Systems Engineering MA
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-K Concepts and Methods
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-S Specialization
  • OSIS: Operating Systems & Information Systems Technology
    • HPI-OSIS-T Techniques and Tools
  • SAMT: Software Architecture & Modeling Technology
    • HPI-SAMT-K Concepts and Methods
  • SAMT: Software Architecture & Modeling Technology
    • HPI-SAMT-S Specialization
  • SAMT: Software Architecture & Modeling Technology
    • HPI-SAMT-T Techniques and Tools
Digital Health MA
Cybersecurity MA
Software Systems Engineering MA


Do you want to trace back a computer virus attack or find patient zero of an epidemic? Do you want to detect where the power grid was disrupted? Do you want to find out who spread the rumor in your social network? Or which of your molecules an unknown substance is acting upon?

All these problems can be formulated as one of source detection in complex graphs. But which algorithm works well for your application? This probably depends on the type of network as well as the type of information spread, and whether you need short runtime or high-quality results. Let’s find out together!
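To make the formulation concrete, here is a minimal Python sketch of one simple unsupervised baseline: it guesses the source as the Jordan center of the infected subgraph, i.e., the infected node whose largest hop distance to any other infected node is smallest. This is only an illustration under strong assumptions (an unweighted, undirected graph stored as an adjacency dictionary, a fully observed set of infected nodes, and a connected infected subgraph); the function names and the toy graph are hypothetical, and the estimator is not taken from the course material or the publications listed below.

```python
from collections import deque


def bfs_distances(graph, start, allowed):
    """Hop distances from `start`, visiting only nodes in `allowed`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor in allowed and neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def jordan_center(graph, infected):
    """Return the infected node with the smallest eccentricity
    within the subgraph induced by the infected nodes."""
    infected = set(infected)
    best_node, best_ecc = None, float("inf")
    for candidate in sorted(infected):  # sorted for deterministic tie-breaking
        dist = bfs_distances(graph, candidate, infected)
        if len(dist) < len(infected):
            continue  # candidate cannot reach every infected node
        ecc = max(dist.values())
        if ecc < best_ecc:
            best_node, best_ecc = candidate, ecc
    return best_node


# Toy example (hypothetical data): path 0-1-2-3-4 with an extra branch 2-5.
graph = {0: [1], 1: [0, 2], 2: [1, 3, 5], 3: [2, 4], 4: [3], 5: [2]}
infected = {1, 2, 3, 5}
estimate = jordan_center(graph, infected)
print(estimate)  # prints 2
```

In the toy example, node 2 is one hop away from every other infected node, so it is returned as the estimated source. Whether such a distance-based guess is adequate depends on the network and the spreading process, which is exactly the kind of question the benchmarking in this seminar is meant to answer.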

In this software project, different teams will work toward the goal of implementing and benchmarking source detection methods (see Fig. 1). Teams will implement existing supervised and unsupervised methods, develop new algorithms, and create benchmarking datasets and metrics. We aim to make our methods fit for application to real-world problems and to prepare a publication draft.


In this project, in addition to tackling an interesting and complex prediction problem, you will learn how to define interfaces between teams, practice organizing yourselves, write scientific and reproducible code, and work with scientific literature.


  • Good programming knowledge in Python or another programming language is absolutely required. You should be able to independently re-implement and apply a rather complex method following its description in a research publication, and to produce running, well-documented code for data preprocessing and analysis.
  • Some practical experience with result visualization and knowledge of descriptive statistics are beneficial for exploring your results and working with real-world data.
  • You should have a good command of written and spoken English.
  • Basic knowledge of graphs and network analysis is beneficial, but it will be possible to refresh or learn these topics along the way.


Examples of source detection algorithms

1. Feizi, S., et al., Network Infusion to Infer Information Sources in Networks. IEEE Transactions on Network Science and Engineering, 2019. 6(3): p. 402-417.

2. Paluch, R., et al., Fast and accurate detection of spread source in large complex networks. Scientific Reports, 2018. 8(1): p. 2508.

3. Shah, D. and T. Zaman, Detecting sources of computer viruses in networks: theory and experiment. SIGMETRICS Perform. Eval. Rev., 2010. 38(1): p. 203-214.


Biological graphs and data as a real-world example

4. Harush, U. and B. Barzel, Dynamic patterns of information flow in complex networks. Nature Communications, 2017. 8(1): p. 2181.

5. Subramanian, A., et al., A Next Generation Connectivity Map: L1000 Platform and the First 1,000,000 Profiles. Cell, 2017. 171(6): p. 1437-1452.e17.

Learning and Teaching Formats

The majority of the seminar will consist of hands-on project work that includes programming, data preparation and analysis, result interpretation, visualization, and reporting; you are allowed to work in pairs on your project.

The first two meetings will be held in a lecture-like format to recapitulate relevant basics. Teams will then form and choose their preferred subproblem. We will meet weekly with all teams to discuss progress and common interfaces for data and information transfer between teams; these meetings will be highly interactive. Additional project-specific meetings are possible on demand. During the last meetings, you will present your project and its results in a final talk that covers your whole analysis, and you will be asked to hand in a written report as well as your documented code.

We plan to offer the seminar in a hybrid format, i.e., we plan to hold the weekly meetings in the lecture halls and to make live dial-in via Zoom available. We will record the first sessions. Please subscribe to the seminar’s Moodle course, which we will also use to share the Zoom link and relevant information such as the planned timeline for the seminar, slides, etc. [https://moodle.hpi.de/course/view.php?id=444].


Regular participation in the progress meetings is required to pass the course. At the end of the lecture period, you will give a talk covering your whole project. We will ask you to hand in your documented code and a final written report at the end of the semester. The final grade will be derived from:

  1. Oral presentation of the final results of your project (45%)
  2. Quality of written final report and code (55%)