Hasso-Plattner-Institut

Implementing information flow in complex networks (Wintersemester 2021/2022)

Lecturers: Dr. Katharina Baum (Data Analytics and Computational Statistics), Pauline Hiort (Data Analytics and Computational Statistics)
Course Website: https://moodle.hpi.de/enrol/index.php?id=216

General Information

  • Weekly Hours: 4
  • Credits: 6
  • Graded: yes
  • Enrolment Deadline: 01.10.2021 - 22.10.2021
  • Teaching Form: Seminar
  • Enrolment Type: Compulsory Elective Module
  • Course Language: English
  • Maximum number of participants: 8

Programs, Module Groups & Modules

Digital Health MA
  • APAD: Acquisition, Processing and Analysis of Health Data
    • HPI-APAD-C Concepts and Methods
  • APAD: Acquisition, Processing and Analysis of Health Data
    • HPI-APAD-T Technologies and Tools
  • APAD: Acquisition, Processing and Analysis of Health Data
    • HPI-APAD-S Specialization
Data Engineering MA
IT-Systems Engineering MA

Description

The world is complex, and so is its data. A large part of this complexity stems from relationships between entities, which call for representing data as networks (graphs). Frequently, the connections between entities in a network represent channels along which some type of information or signal can be relayed, e.g. physical interactions between individuals during which news is exchanged, molecular interactions that enable a biological system to process and respond adequately to a stimulus, or roads between locations along which goods can be exchanged. By modeling these processes as the flow of information along network links, complex system dynamics in various real-world networks can be traced and studied, such as the spread of rumours, news or diseases in social networks, perturbations from drugs within molecular networks, or optimal supply routes in road networks.

Please note that the seminar does not focus on classical information flow as in information theory, i.e. encoding, decoding and the capacity of message relay in communication networks, but of course some of the principles are related. Instead, this seminar focuses on hands-on data analysis where we aim to derive meaningful interpretations for real-world scenarios from various types of information flow analyses in networks. The goal is to implement recent computational methods and to use these implementations for specific analyses and interpretations of results. Potential methods for representing and analysing information flow that we could explore are, for example, network diffusion with different transition dynamics, distance-based influence representation, node embeddings by random walks, or the inverse problem of tracing network structure or signal sources from snapshots of system states. As we plan to work with real-world applications, we will dive into data preparation and preprocessing, and combine data from different sources. 
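To give a flavor of what "network diffusion" can look like in code, here is a minimal sketch of a random walk with restart, one common diffusion scheme. It is not taken from the seminar material: the function name, the toy path graph, and the restart probability of 0.3 are illustrative assumptions, and the projects may use entirely different transition dynamics.

```python
def random_walk_with_restart(neighbors, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Diffuse a signal from seed nodes over an undirected, unweighted graph.

    neighbors: dict mapping each node to a list of its neighbors (hypothetical format).
    seeds:     set of nodes where the signal starts.
    restart:   probability of jumping back to the seeds in each step (assumed value).
    Returns a dict node -> stationary visiting probability.
    """
    nodes = list(neighbors)
    if any(len(neighbors[v]) == 0 for v in nodes):
        raise ValueError("every node needs at least one neighbor")

    # Initial distribution: all probability mass spread evenly over the seeds.
    p0 = {v: (1.0 / len(seeds) if v in seeds else 0.0) for v in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        # Restart term: a fraction of the mass always returns to the seeds.
        p_next = {v: restart * p0[v] for v in nodes}
        # Walk term: each node passes the rest of its mass equally to its neighbors.
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(neighbors[u])
            for v in neighbors[u]:
                p_next[v] += share
        # Stop once the distribution no longer changes noticeably.
        if sum(abs(p_next[v] - p[v]) for v in nodes) < tol:
            return p_next
        p = p_next
    return p


# Toy example: a path graph 0 - 1 - 2 - 3 with the signal starting at node 0.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = random_walk_with_restart(path_graph, seeds={0})
```

In this toy example the scores decrease with distance from the seed node, which is the sense in which the diffusion result can be read as an influence profile of the seed over the network.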

We will first introduce or recapitulate some basics on graphs, networks and how data can be analyzed in a network context, and we will give an overview of different flavors of information flow in networks. You will then choose a specific approach to follow up on from a pool of recent research papers (or, on request and if appropriate, a similar paper of your own choice). Depending on the topic, you can choose appropriate datasets that interest you, and decide on the focus and direction of the analysis you want to perform. Finally, you will implement and apply the method of your choice. In doing so, you will assess the method’s utility for the task, and interpret and visualize the results.

Learning Objectives:

  • You will learn methods for preparing and combining data from different sources, and interpreting and analyzing them in a network context. 
  • You will learn how to interpret several approaches to information flow in complex networks, how to implement at least one of them, and how to apply it to the analysis of a real-world scenario.
  • You will train your ability to engage critically with research publications and to find and consult secondary literature.
  • You will practice organizing work in a group of two and presenting and visualizing your results scientifically, both orally and in writing.

Requirements

  • Good programming knowledge in Python or another programming language is absolutely required. You should be able to – independently – re-implement and apply a rather complex method following the description in a research publication and establish running, well-documented code for data preprocessing and analysis.
  • Due to the focus on data analysis, some practical experience with result visualization and knowledge of descriptive statistics are beneficial.
  • You should have a good command of written and spoken English.
  • Basic knowledge of graphs and network analysis is beneficial, but it will be possible to refresh or learn it along the way.

Literature

  1. Chittaranjan Hens & Uzi Harush, et al. Spatiotemporal signal propagation in complex networks. Nature Physics 2019;15(4):403-412.
  2. Feixiong Cheng & István Kovács, et al. Network-based prediction of drug combinations. Nature Communications 2019; 10:1197.
  3. Grover A, Leskovec J. node2vec: Scalable Feature Learning for Networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining; 2016. p. 855–64.

Learning

The majority of the seminar will consist of hands-on project work that includes programming, data preparation and analysis, result interpretation, visualization, and reporting; you are allowed to work in pairs on your project.

The first meetings will be held in a lecture-like format to recapitulate relevant basics of graph and network analysis and to introduce information flow concepts as well as best practices for data preparation and preprocessing. We will give a short overview of potential project topics. After each group has settled on a topic, you will present your project ideas and establish a project timeline. Subsequent weekly meetings with all groups will serve for short updates on your project status and will be highly interactive. In addition, each project group will have at least two mandatory meetings with the lecturing team (via Zoom or in person) for in-depth discussion of data preparation, content and timeline. Further project-specific meetings are possible on demand. During the last meetings, you will present your project and its results in a final talk that covers your whole analysis, and you will be asked to hand in a written report as well as your documented code.

We plan to offer the seminar in a hybrid format, i.e., we plan to be in the lecture halls for the weekly meetings and to make it possible to join live via Zoom. We will record the sessions whenever possible.

Please subscribe to the seminar’s Moodle course, which we will also use to share the Zoom link and relevant information such as the planned timeline for the seminar, lecture slides etc.

moodle.hpi.de/course/view.php

Examination

To pass the course, you must participate regularly in the progress meetings, attend the two in-group meetings with the lecturing team, and submit a short project proposal outlining the scope and timeline of your analysis (ungraded). After familiarizing yourself with the project, you will present your ideas in a short introductory talk. At the end of the lecture period, you will give a talk covering your whole project. We will ask you to hand in your documented code and a final written report at the end of the semester. The final grade will be derived from:

  1. Oral presentation as introduction to the project (15%)
  2. Oral presentation of the final results of your project (30%)
  3. Quality of written final report and code (55%)
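As a small illustration of how these weights combine, the sketch below computes a weighted average of three component grades. It assumes that all components are graded on the same numeric scale and that the final grade is a plain weighted sum; the example grades and the names used here are hypothetical, and any rounding to official grade steps is not covered.

```python
# Weights taken from the list above; the keys are hypothetical labels.
WEIGHTS = {
    "intro_talk": 0.15,       # oral presentation as introduction to the project
    "final_talk": 0.30,       # oral presentation of the final results
    "report_and_code": 0.55,  # written final report and code
}


def weighted_grade(component_grades):
    """Return the weighted average of the component grades (same scale assumed)."""
    return sum(WEIGHTS[name] * component_grades[name] for name in WEIGHTS)


# Hypothetical example grades, for illustration only.
example = weighted_grade({"intro_talk": 2.0, "final_talk": 1.0, "report_and_code": 1.0})
```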

Dates

Fridays 9:15-10:45, starting from November 1 (due to the general assembly on October 25) in A1.2, or via Zoom (see the course’s Moodle for the link and for additional course information, https://moodle.hpi.de/course/view.php?id=216).

First grading: December 17, 2021 (opt-out December 9, 2021)
