Hasso-Plattner-Institut
Prof. Dr. Christoph Lippert
  
 

Statistical hypothesis testing in deep learning models

Overview:

We are developing methods for the statistical analysis of large biomedical data. In particular, imaging provides a powerful means of measuring phenotypic information at scale. While images are abundantly available in large repositories such as the UK Biobank, the analysis of imaging data poses new challenges for statistical methods development.

 

References:

  • Kirchler, M., Khorasani, S., Kloft, M., & Lippert, C. (2020, June). Two-sample testing using deep learning. In International Conference on Artificial Intelligence and Statistics (pp. 1387-1398). PMLR. https://arxiv.org/abs/1910.06239
  • Kirchler, M., Konigorski, S., Schurmann, C., Norden, M., Meltendorf, C., Kloft, M., & Lippert, C. transferGWAS: GWAS of images using deep transfer learning. Manuscript in preparation.

 



Fast kernel-based genome-wide association tests

Overview:

In recent years, deep learning has enabled accurate prediction of the function of DNA and RNA sequences from their nucleotide sequences alone. In another line of research, kernel-based tests have been established as powerful association tests for rare genetic variants. Here, we combine these two streams of research and present seak (sequence annotations in kernel-based tests): a fast implementation of flexible set-based genetic association tests that include variant effects on intermediate molecular traits while correcting for family and population structure. This can be interpreted as testing for genetic effects that are mediated or moderated by intermediate molecular traits.
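
The general idea of a set-based kernel test with annotation weights can be illustrated with a minimal sketch. This is not the seak API: the function names, the weighted linear kernel, and the permutation-based p-value are assumptions chosen for illustration, and the sketch does not correct for family or population structure. Here `weights` stands in for per-variant effect predictions, for example from a deep learning model.

```python
import numpy as np

def kernel_score_statistic(y, X, G, weights):
    """Score-type statistic Q = r^T K r with K = (G W)(G W)^T (hypothetical helper)."""
    # Residuals of the phenotype under the null model (covariates only).
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    # Weighted linear kernel: each variant column of G is scaled by its weight.
    s = (G * weights).T @ r  # one weighted score per variant
    return float(s @ s)

def permutation_pvalue(y, X, G, weights, n_perm=999, seed=0):
    """Permutation p-value for Q; a simple stand-in for an analytic null distribution."""
    rng = np.random.default_rng(seed)
    q_obs = kernel_score_statistic(y, X, G, weights)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling genotype rows breaks any genotype-phenotype association.
        perm = rng.permutation(len(y))
        if kernel_score_statistic(y, X, G[perm], weights) >= q_obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Here `y` is a phenotype vector of length n, `X` an n-by-p covariate matrix including an intercept column, and `G` an n-by-m genotype matrix for one variant set. A variant with zero weight does not contribute to the statistic, which is how annotations can focus the test on variants predicted to be functional.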

 

References:

  • Konigorski S, Monti R, Rautenstrauch P, Lippert C (2020). Fast kernel-based rare-variant association tests integrating variant annotations from deep learning. In: The 2020 Annual Meeting of the International Genetic Epidemiology Society. Genetic Epidemiology 44(5): 495. https://doi.org/10.1002/gepi.22298.
  • Konigorski S, Monti R, Lippert C (2019). Kernel-based tests integrating variant effect predictions from deep learning for genetic association tests of rare variants. https://dx.doi.org/10.3205/19gmds067.
  • Konigorski S, Khorasani S, Lippert C (2018). Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer’s disease. In: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018, arXiv:1812.00448. https://arxiv.org/abs/1812.00448.
  • Konigorski S, Lippert C (2018). Kernel-based tests for very rare variants. In: The 2018 Annual Meeting of the International Genetic Epidemiology Society. Genetic Epidemiology 42(7): 711. https://doi.org/10.1002/gepi.22163.

 



International consortium for integrative genomics prediction (INTERVENE)

Overview:

The aim of INTERVENE is to develop and test next-generation tools for disease prevention, diagnosis, and personalised treatment, utilizing the first US-European pool of genomic and health data and integrating longitudinal and disease-relevant -omics data into genetic risk scores. This results in unprecedented potential for prediction, diagnosis, and personalised treatment of complex and rare diseases. Some of the largest biobanks in Europe and two in the USA will be securely linked and harmonized in a GDPR-compliant repository with data from more than 1.7 million genomes. INTERVENE will demonstrate the potential and benefits of powerful AI technologies for the next generation of integrative genetic scores (IGS). The clinical and economic benefits of IGS will be evaluated in key disease areas with a major public health burden. Here, the newly developed IGS will be taken into a clinical environment, and their real-world benefits will be evaluated together with clinical experts, European patient advocacy groups, and medical societies, taking regulatory and ethical implications into account. Thus, a framework for legally and ethically responsible translation into wider clinical practice will be developed. Moreover, the partners will develop and test the role of IGS in several rare diseases as well as in COVID-19 infection and severity. Importantly, to support the application of IGS via public-private partnerships including clinical practitioners, an AI-enabled federated data analysis platform, the ‘IGS4EU’ platform, will be developed for automated IGS generation and interpretation for end users. Additionally, the IGS4EU platform will give the AI community access to the INTERVENE data and methodological know-how through a competition-based benchmarking environment. In the long term, the IGS4EU platform aims to grow the disease coverage and enable wide adoption of IGS as a gold standard in clinical research and practice.