Hasso-Plattner-Institut
Prof. Dr. h.c. Hasso Plattner
  
 

Data-Driven Causal Inference Research Seminar

General Information

About this Seminar

In this seminar, we invite students who are interested in working on research-related topics in the area of data-driven causal inference. In particular, we focus on causal structure learning (CSL) in heterogeneous settings and on parallel execution strategies for causal structure learning in GPU-accelerated environments. A brief introduction to our research can be found on our project page.

Logistics

  • In the first meeting, Tue 04/13/2021 at 9:15, we will present the different research-related topics.
  • The meetings will be held online.

 

Topics

This list of topics is not exhaustive and we are happy to discuss research projects based on your previous experience and personal interests.

  • Learning causal structures using multi-GPU environments: While causal structure learning can benefit from execution on a single GPU, it has not yet been adapted to multi-GPU environments. In this setting, smart task distribution during execution is essential to avoid performance bottlenecks, such as page faults or memory thrashing. Hence, a well-designed multi-GPU-aware causal structure learning algorithm is required to use the hardware's full potential.
  • Advanced out-of-core GPU-accelerated causal structure learning: For very high-dimensional data, the global memory of a single GPU becomes a limiting factor for the execution of causal structure learning algorithms. In this setting, out-of-core execution describes the processing of datasets that exceed GPU device memory. Out-of-core execution requires either explicit data management or the use of the GPU's memory migration unit. The goal is to develop optimized approaches that improve on existing basic out-of-core implementations.
  • Experimental analysis of parallel causal structure learning: There is a multitude of approaches to learning causal structures, and often several implementations of the same algorithm. Based on literature research on parallel causal structure learning approaches, derive and develop an experimental evaluation setup and compare available implementations, with a focus on constraint-based causal structure learning.
  • Evaluation of different methods for CSL within the mixed additive noise model: The omnipresence of heterogeneous data in real-world scenarios currently impedes the application of causal structure learning in practice. This is also due to the uncertainty of the functional relationships within heterogeneous data. As the mixed additive noise model provides the first step towards formalization, it remains to evaluate and compare the quality of different methods for causal structure learning in this setting.
  • Quantifying causal influences in a causal graphical model: While there exist many methods for causal structure learning from observational data, quantifying the causal strength of an edge within the causal graphical model remains a nontrivial question. Building on recent research that formalizes causal strength from an information-theoretic perspective, the task is to consider such a measure for heterogeneous data.
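
For orientation on the constraint-based topics above: constraint-based causal structure learning repeatedly tests pairs of variables for conditional independence. The following is a minimal sketch of one common choice for jointly Gaussian data, a partial-correlation test with Fisher's z-transform. It is an illustration only and not a method prescribed by this seminar; the function name `fisher_z_independent`, its parameters, and the simulated chain X -> Y -> Z are assumptions made for this example.

```python
import math
import numpy as np


def fisher_z_independent(corr, n_samples, i, j, cond=(), alpha=0.05):
    """Hypothetical helper: True if variables i and j appear independent given cond.

    corr is the correlation matrix of all variables, n_samples the number of
    observations it was estimated from. Assumes jointly Gaussian data.
    """
    idx = [i, j, *cond]
    sub = corr[np.ix_(idx, idx)]
    # The partial correlation of i and j given cond follows from the
    # inverse of the correlation sub-matrix (the precision matrix).
    prec = np.linalg.inv(sub)
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = min(max(r, -0.999999), 0.999999)  # keep the log below finite
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))  # Fisher's z-transform
    stat = math.sqrt(n_samples - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2.0))  # two-sided normal p-value
    return p_value > alpha


# Simulated chain X -> Y -> Z (illustrative data, not from the seminar).
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)
z = -1.5 * y + rng.normal(size=n)
corr = np.corrcoef(np.column_stack([x, y, z]), rowvar=False)

print("X independent of Z:        ", fisher_z_independent(corr, n, 0, 2))
print("X independent of Z given Y:", fisher_z_independent(corr, n, 0, 2, cond=(1,)))
```

In a chain like this, X and Z are dependent marginally but independent given Y, which is the kind of pattern a constraint-based algorithm uses to remove the edge between X and Z. Because the individual tests are largely independent of one another, they are a natural unit of work for parallel or GPU-based execution.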

Seminar Schedule

  1. Topics: During the first week of the lecture period, topics will be presented by the instructors and chosen by the participants. The topics can be worked on alone or in groups of two.
  2. Familiarization: The participants are expected to familiarize themselves with the chosen topic and study recent publications that are provided by the instructors.
  3. Midterm Presentations: Presentations of approximately 20 minutes (15 min. presentation + 5 min. Q&A) will be held halfway through the lecture period.
  4. Project: Afterwards, implementations and evaluations will be conducted while participants receive guidance from the instructors.
  5. Final Presentations: Presentations of approximately 20 minutes (15 min. presentation + 5 min. Q&A) will be held at the end of the lecture period.
  6. Scientific Report & Peer Review: In the end, a scientific report is written, including a peer-review phase among the participants. This report should define the targeted problem, document the selected approach including its implementation, and present a comprehensive evaluation, which is set into the context of related literature.

Preconditions

  • Good knowledge of statistical and mathematical methods
  • Prior attendance of the Causal Inference – Theory and Applications in Enterprise Computing lecture, or an equivalent, is beneficial
  • For selected topics, knowledge of processing hardware is beneficial

Grading

  • 50% Project result and presentation
  • 40% Scientific report
  • 10% Personal engagement