Hasso-Plattner-Institut
Prof. Dr. h.c. Hasso Plattner
  
 

Christopher Schmidt

Research Assistant, PhD Candidate

Phone: +49 (331) 5509 - 1317
Fax: +49 (331) 97992 - 579
E-Mail: christopher.schmidt(at)hpi.de
Room: Hasso-Plattner-Villa, V 2.01

Research

Parallel Execution Strategies for Causal Structure Learning on GPUs

Learning causal relationships from observational data is insightful for researchers in multiple domains. For example, in genetic research, gene regulatory networks can be derived from gene expression data. Real-world gene expression datasets are often high-dimensional, which leads to long execution times that prohibit the application of constraint-based Causal Structure Learning (CSL) algorithms in practice.
As part of our research on Data-Driven Causal Inference, we investigate how to adapt existing CSL algorithms to the parallel processing capabilities of modern hardware in order to speed up execution. This fosters the application of CSL in real-world settings, both in industry and in research. On multi-core CPUs, we introduce dynamic load balancing for the parallel execution of CSL algorithms; it avoids under- or overutilization of compute resources and thereby reduces runtimes.
In recent years, GPUs have become a platform for massively parallel processing. By utilizing the thousands of parallel processing units and the shared memory of GPUs, we achieve runtime improvements by factors of up to 700 for multivariate normally distributed data.
As a next step, we will generalize our GPU-accelerated algorithm to fit different data distributions. For this purpose, we are developing a definition of tasks for parallel execution that is independent of the underlying data distribution. Such tasks support fine-grained parallelism, e.g., to speed up the processing of raw observational data during conditional independence tests.

Publications

  • Schmidt, C., Huegle, J., Bode, P., Uflacker, M.: Load-Balanced Parallel Constraint-Based Causal Structure Learning on Multi-Core Systems for High-Dimensional Data. SIGKDD Workshop on Causal Discovery. p. 59--77 (2019).
     
  • Schmidt, C., Uflacker, M.: Workload-Driven Data Placement for GPU-Accelerated Database Management Systems. Datenbanksysteme für Business, Technologie und Web (BTW 2019), 18. Fachtagung des GI-Fachbereichs „Datenbanken und Informationssysteme“ (DBIS), 4.-8. März 2019, Rostock, Germany, Workshopband (2019).
     
  • Schmidt, C., Huegle, J., Uflacker, M.: Order-independent constraint-based causal structure learning for gaussian distribution models using GPUs. SSDBM '18 Proceedings of the 30th International Conference on Scientific and Statistical Database Management. p. 19:1--19:10. ACM, New York, NY, USA (2018).
     
  • Schmidt, C., Dreseler, M., Akin, B., Roy, A.: A Case for Hardware-Supported Sub-Cache Line Accesses. Data Management on New Hardware (DaMoN), in conjunction with SIGMOD (2018).
     
  • Schwarz, C., Schmidt, C.: Interactive Product Cost Simulation on Coprocessors. HPI Future SOC Lab: Proceedings 2015. p. 103--107 (2017).
     
  • Schwarz, C., Schmidt, C., Hopstock, M., Sinzig, W., Plattner, H.: Efficient Calculation and Simulation of Product Cost Leveraging In-Memory Technology and Coprocessors. The Sixth International Conference on Business Intelligence and Technology (BUSTECH 2016) (2016).