Hasso-Plattner-Institut
Prof. Dr. Christoph Lippert
 

Statistical hypothesis testing in deep learning models

Overview:

We are developing methods for the statistical analysis of large biomedical data. In particular, imaging provides a powerful means of measuring phenotypic information at scale. While images are abundantly available in large repositories such as the UK Biobank, the analysis of imaging data poses new challenges for the development of statistical methods.

 

References:

  • Kirchler, M., Khorasani, S., Kloft, M., & Lippert, C. (2020, June). Two-sample testing using deep learning. In International Conference on Artificial Intelligence and Statistics (pp. 1387-1398). PMLR. https://arxiv.org/abs/1910.06239
  • Kirchler M, Konigorski S, Norden M, Meltendorf C, Kloft M, Schurmann C, Lippert C (2021). transferGWAS: GWAS of images using deep transfer learning. bioRxiv 2021.10.22.465430. https://doi.org/10.1101/2021.10.22.465430.

 



Fast functionally informed kernel-based association tests

Overview:

In recent years, deep learning has enabled the accurate prediction of the function of DNA and RNA based on nucleotide sequence alone. In another line of research, kernel-based tests have been established as powerful association tests of rare genetic variants. Here, we combine these two streams of research and present a fast implementation of flexible set-based genetic association tests that include variant effects on intermediate molecular traits. This can be interpreted as testing for genetic effects that are mediated or moderated by intermediate molecular traits.
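As a rough illustration of the idea above, the following minimal sketch computes a set-based score statistic with a weighted linear kernel, where each variant is weighted by a predicted functional effect. It is not the implementation from the referenced papers. The function names, the intercept-only null model, and the permutation-based p-value are all assumptions chosen to keep the example short; a fast implementation would more likely use an analytic null distribution and adjust for covariates.

```python
import numpy as np


def weighted_kernel_score_stat(G, y, weights):
    """Score-type statistic Q = r^T K r with K = (G W)(G W)^T.

    G: (n_samples, n_variants) genotype matrix for one variant set.
    y: (n_samples,) quantitative phenotype.
    weights: (n_variants,) hypothetical per-variant weights, e.g. derived
        from deep-learning variant effect predictions (an assumption here).
    """
    G = np.asarray(G, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    r = y - y.mean()          # residuals under an intercept-only null model
    s = (G * w).T @ r         # weighted per-variant scores, shape (n_variants,)
    return float(s @ s)       # equals r^T (G W)(G W)^T r


def permutation_pvalue(G, y, weights, n_perm=999, seed=0):
    """Empirical p-value obtained by permuting the phenotype (illustrative only)."""
    rng = np.random.default_rng(seed)
    q_obs = weighted_kernel_score_stat(G, y, weights)
    q_null = np.array([
        weighted_kernel_score_stat(G, rng.permutation(y), weights)
        for _ in range(n_perm)
    ])
    # Add-one correction keeps the estimate strictly positive.
    return (1 + np.sum(q_null >= q_obs)) / (n_perm + 1)
```

Setting a variant's weight to zero removes it from the test, so in this sketch the functional predictions act as a prior on which variants in the set are allowed to contribute to the association signal.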

References:

  • Monti, R., Rautenstrauch, P., Ghanbari, M., James, A. R., Kirchler, M., Ohler, U., Konigorski, S., & Lippert, C. (2022). Identifying interpretable gene-biomarker associations with functionally informed kernel-based tests in 190,000 exomes. Nature Communications, 13(1), 1-16. https://doi.org/10.1038/s41467-022-32864-2
  • Konigorski S, Monti R, Rautenstrauch P, Lippert C (2020). Fast kernel-based rare-variant association tests integrating variant annotations from deep learning. In: The 2020 Annual Meeting of the International Genetic Epidemiology Society. Genetic Epidemiology 44(5): 495. https://doi.org/10.1002/gepi.22298.
  • Konigorski S, Monti R, Lippert C (2019). Kernel-based tests integrating variant effect predictions from deep learning for genetic association tests of rare variants. https://dx.doi.org/10.3205/19gmds067.
  • Konigorski S, Khorasani S, Lippert C (2018). Integrating omics and MRI data with kernel-based tests and CNNs to identify rare genetic markers for Alzheimer’s disease. In: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018, arXiv:1812.00448. https://arxiv.org/abs/1812.00448.
  • Konigorski S, Lippert C (2018). Kernel-based tests for very rare variants. In: The 2018 Annual Meeting of the International Genetic Epidemiology Society. Genetic Epidemiology 42(7): 711. https://doi.org/10.1002/gepi.22163.

 



International consortium for integrative genomics prediction (INTERVENE)

Overview:

The aim of INTERVENE is to develop and test next-generation tools for disease prevention, diagnosis, and personalised treatment, utilizing the first US-European pool of genomic and health data and integrating longitudinal and disease-relevant -omics data into genetic risk scores. This results in unprecedented potential for prediction, diagnosis, and personalised treatment of complex and rare diseases. Some of the largest biobanks in Europe and two in the USA will be securely linked and harmonized in a GDPR-compliant repository with data from more than 1.7 million genomes. INTERVENE will demonstrate the potential and benefits of powerful AI technologies for the next generation of integrative genetic scores (IGS). The clinical and economic benefits of IGS will be evaluated in key disease areas with a major public health burden. Here, the newly developed IGS will be taken into a clinical environment, and their real-world benefits will be evaluated together with clinical experts, European patient advocacy groups, and medical societies, taking regulatory and ethical implications into account. Thus, a framework for legally and ethically responsible translation into wider clinical practice will be developed. Moreover, the partners will develop and test the role of IGS in several rare diseases as well as in COVID-19 infection and severity. Importantly, to support the application of IGS via public-private partnerships including clinical practitioners, an AI-enabled federated data analysis platform, the ‘IGS4EU’ platform, will be developed for automated IGS generation and interpretation for end users. Additionally, the IGS4EU platform will give the AI community access to the INTERVENE data and methodological know-how through a competition-based benchmarking environment. In the long term, the IGS4EU platform aims to grow its disease coverage and enable wide adoption of IGS as a gold standard in clinical research and practice.