Hasso-Plattner-Institut
Prof. Dr. h.c. Hasso Plattner
  
 

Data-Driven Causal Inference

Probabilistic Machine Learning and Hardware Acceleration

We address open challenges in causal inference by improving both the application of statistical and probabilistic concepts and GPU-based acceleration, with the goal of enabling its use in real-world settings.

Motivation

The emergence of the Internet of Things (IoT) allows for a comprehensive analysis of industrial manufacturing processes. While domain experts within a company have enough expertise to identify the most common relationships, they require support as both the amount of observational data and the complexity of large systems of observed features grow. Causal inference algorithms from machine learning can close this gap by deriving the underlying causal relationships between the observed features.

Background

The questions that motivate most data analysis are not associational but causal in nature: what are the causes, and what are the effects, of the events under observation? Nevertheless, the statistical methods commonly used today to answer those questions are associational in nature. But “Correlation does not imply causation!”, and misinterpreting an association as a causal effect often leads to incorrect conclusions.

In recent years, causality has grown from a nebulous concept into a mathematical theory. This is largely due to the work of Judea Pearl (2009), who received the 2011 Turing Award for “fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning”.

Schematic representation of the causal inference procedure

In this framework, causal relationships are modelled in a probabilistic graph consisting of a finite set of nodes and edges, which represent the involved features and the causal relationships between them, respectively. Algorithms for causal inference use conditional independence tests to obtain information about the underlying relationships, which yields an undirected skeleton of the graph. Building on this skeleton, the algorithms determine the orientation of the detected relationships to construct a causal graphical model.
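To make the role of conditional independence tests more concrete, the following is a minimal sketch of one common choice, a partial-correlation test with Fisher's z-transform. It is not taken from the project itself: it assumes jointly Gaussian data, supports at most one conditioning variable, and the names `pearson`, `fisher_z_pvalue`, and `simulate_chain` are purely illustrative.

```python
import math
import random
from statistics import NormalDist


def pearson(a, b):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fisher_z_pvalue(x, y, z=None):
    """Two-sided p-value for H0: x is independent of y (given z, if supplied).

    Assumption: the data are jointly Gaussian, so that a vanishing
    (partial) correlation corresponds to (conditional) independence.
    """
    n = len(x)
    if z is None:
        r, k = pearson(x, y), 0
    else:
        # First-order partial correlation of x and y given z.
        r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
        r = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
        k = 1
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    stat = math.sqrt(n - k - 3) * abs(math.atanh(r))
    return 2 * (1 - NormalDist().cdf(stat))


def simulate_chain(n=2000, seed=0):
    """Toy data from the causal chain X -> Z -> Y with Gaussian noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [2.0 * xi + rng.gauss(0, 1) for xi in x]
    y = [-1.5 * zi + rng.gauss(0, 1) for zi in z]
    return x, z, y


if __name__ == "__main__":
    x, z, y = simulate_chain()
    print("p-value, X vs. Y:        ", fisher_z_pvalue(x, y))
    print("p-value, X vs. Y given Z:", fisher_z_pvalue(x, y, z))
```

In this toy chain, X and Y are strongly associated, but the association should vanish once Z is conditioned on. A skeleton-learning algorithm would therefore remove the edge between X and Y while keeping the edges X–Z and Z–Y.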

Research Goals

Our research on causal inference comprises several workstreams that together aim to enable efficient application in real-world scenarios.

Causal Inference Procedure:

  • Allow for more flexible conditional independence testing
  • Leverage existing knowledge of relationships

Hard- and Software Acceleration:

  • Application of Graphics Processing Units (GPUs) to develop efficient implementations
  • Examination of integration into in-memory databases

Application of Causal Structural Knowledge:

  • Active learning of a functional system based on a causal structural model
  • Evaluation of interventional simulation and optimization approaches
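As an illustration of what an interventional simulation can look like, the following sketch contrasts observational data with data generated under an intervention do(X) in a toy linear structural causal model. The model (a confounder C influencing both X and Y, with a true causal effect of 1.0 of X on Y) and the function names `simulate` and `slope` are assumptions made for this example only and do not describe the project's actual approach.

```python
import random


def simulate(n, intervene, seed=0):
    """Draw n samples (x, y) from a toy linear model with confounder C.

    Hypothetical model: C -> X, C -> Y, and X -> Y with effect 1.0.
    With intervene=True, X is set independently of C, i.e. do(X).
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        c = rng.gauss(0, 1)
        if intervene:
            x = rng.gauss(0, 1)          # do(X): the edge C -> X is cut
        else:
            x = c + rng.gauss(0, 1)      # observational: X depends on C
        y = 1.0 * x + 2.0 * c + rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    return xs, ys


def slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


if __name__ == "__main__":
    obs = slope(*simulate(20000, intervene=False))
    do = slope(*simulate(20000, intervene=True))
    print("observational slope:", obs)
    print("interventional slope:", do)
```

In this toy model, the observational slope overstates the effect of X on Y because of the confounder (its expected value is 2.0), whereas the slope under intervention should be close to the true causal effect of 1.0.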

Use Cases

Together with various cooperation partners, we examine how causal inference can be applied in real-world contexts.

Manufacturing:

  • Identification of relevant involved factors in an industrial manufacturing process
  • Derivation of structural causal graphs in a complex production process

Medicine:

  • Application of causal inference in the context of gene expression data
  • Incorporation of genetic variants and gene expression data to detect causal relationships

Database Operations:

  • Examination of causal inference on the basis of system time series data
  • Derivation of insights in the context of unexpected system behavior
