Interaction detection between genomic sequence motifs in convolutional neural networks

Marta Lemanczyk

Ph.D. student in the Data Analytics and Computational Statistics Group

Contact Information

Office: F-E.08
Tel.: +49 331 5509-4975
Email: Marta.Lemanczyk(at)hpi.de

Supervisor: Prof. Dr. Bernhard Renard
 

We investigate the application of interpretability and interaction detection methods to convolutional neural networks trained on genomic sequence data. Our goal is to find higher-level interactions between relevant motifs in genomic sequences.

Introduction

Deep neural networks are capable of learning non-linear interactions between features that have an impact on the network's decisions. Explaining these decisions remains challenging because neural networks are black-box models. Particularly in medical applications, it is of great importance to understand the decisions made for sensitive tasks. One direct application in the biomedical field is deep learning on genomic sequences, a field that has become more accessible in recent years due to advances in Next-Generation Sequencing. Convolutional Neural Networks (CNNs) are popular for these tasks because of their ability to learn patterns in the input space. One way to find relevant patterns for a specific prediction task is to calculate contribution scores for single nucleotides, which together form biologically significant motifs. However, important motifs alone are often not enough to explain the outcome: biological mechanisms can also involve complex interactions between those motifs. Our research focuses on how to identify such interactions in genomic sequences learned by CNNs.
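
As a minimal, hypothetical sketch of this setup (not the models used in this project), the following Python/PyTorch snippet one-hot encodes a DNA sequence and passes it through a small CNN that predicts a single binary label; all layer sizes and names are illustrative assumptions.

# Minimal sketch (not the project's actual model): one-hot encoding a DNA
# sequence and passing it through a small CNN for binary classification.
import torch
import torch.nn as nn

NUCLEOTIDES = "ACGT"

def one_hot_encode(seq: str) -> torch.Tensor:
    """Encode a DNA sequence as a 4 x L tensor (one channel per nucleotide)."""
    idx = torch.tensor([NUCLEOTIDES.index(n) for n in seq])
    return nn.functional.one_hot(idx, num_classes=4).T.float()

class SimpleSeqCNN(nn.Module):
    def __init__(self, n_filters: int = 16, filter_len: int = 8):
        super().__init__()
        self.conv = nn.Conv1d(4, n_filters, kernel_size=filter_len)
        self.pool = nn.AdaptiveMaxPool1d(1)   # global max pooling over positions
        self.fc = nn.Linear(n_filters, 1)

    def forward(self, x):                     # x: (batch, 4, L)
        h = torch.relu(self.conv(x))
        h = self.pool(h).squeeze(-1)
        return self.fc(h)                     # logit for the binary label

x = one_hot_encode("ACGTGACCTGATTACA").unsqueeze(0)  # shape (1, 4, 16)
model = SimpleSeqCNN()
print(model(x).shape)                                # torch.Size([1, 1])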

    Research Topics

    Interpretability in Deep Learning

    The aim of interpretability methods is to find relevant features that influence the outcome of a prediction task by assigning a contribution score to each input feature. Most of these methods are based either on gradients [1] or on backpropagation of an attribution score [2,3]. With the help of such methods, it is possible to identify regions in genomic sequences with higher contributions to a specific outcome.
    We analyze how various interpretability methods behave when interactions between motifs are included, in order to see whether contribution scores differ between main effects of single motifs and interacting motifs.
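
    As an illustration of a gradient-based contribution score, the following sketch computes a simple gradient-times-input attribution per sequence position. It is a simplified stand-in for the methods cited above [1-3], not our actual analysis pipeline, and it reuses the hypothetical one_hot_encode helper and SimpleSeqCNN model from the sketch in the introduction.

    # Hedged sketch: gradient x input contribution scores per nucleotide
    # position, reusing the hypothetical helpers defined earlier.
    import torch

    def gradient_x_input(model, seq: str) -> torch.Tensor:
        """Per-position contribution scores for the model's output logit."""
        x = one_hot_encode(seq).unsqueeze(0).requires_grad_(True)
        logit = model(x).sum()
        logit.backward()
        # Element-wise product of gradient and input; summing over the 4
        # nucleotide channels gives one score per sequence position.
        return (x.grad * x).squeeze(0).sum(dim=0)

    scores = gradient_x_input(model, "ACGTGACCTGATTACA")
    print(scores)  # tensor of length 16, one contribution score per position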

    Representation of genomic motifs in CNNs

    One step towards detecting interactions between motifs is to extract the motifs learned by a CNN. The model receives genomic sequences, i.e., strings of nucleotides, as input. In the convolutional layers, spatial relationships between nucleotides are preserved, so the CNN can memorize relevant patterns. Depending on the network's architecture, motifs are learned directly in the first layer or distributed across deeper layers. An additional challenge is redundancy, since multiple nodes can learn the same motif. There are multiple approaches to extracting patterns from CNNs, which often include the analysis of the model's filters, feature maps, or importance scores calculated by interpretability methods. We will look at multiple settings for network architectures and evaluate how well motifs can be extracted.
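
    A common way to turn first-layer filters into motif representations is to align the input subsequences that activate each filter most strongly and aggregate them into a position frequency matrix. The following hedged sketch implements this idea for the hypothetical model from the introduction sketch; shapes and the one-best-position-per-sequence rule are illustrative simplifications.

    # Hedged sketch of filter-to-motif extraction: for each first-layer filter,
    # take the maximally activating subsequence per input sequence and
    # accumulate the one-hot counts into a position frequency matrix (PFM).
    import torch

    def filter_motifs(model, batch: torch.Tensor) -> torch.Tensor:
        """batch: (N, 4, L) one-hot sequences -> PFMs of shape (n_filters, 4, k)."""
        with torch.no_grad():
            acts = torch.relu(model.conv(batch))        # (N, n_filters, L - k + 1)
        n_filters, k = model.conv.out_channels, model.conv.kernel_size[0]
        pfms = torch.zeros(n_filters, 4, k)
        best_pos = acts.argmax(dim=-1)                  # best position per filter
        for n in range(batch.shape[0]):
            for f in range(n_filters):
                p = best_pos[n, f].item()
                pfms[f] += batch[n, :, p:p + k]         # accumulate one-hot counts
        return pfms / batch.shape[0]                    # column-wise frequencies

    batch = torch.stack([one_hot_encode("ACGTGACCTGATTACA"),
                         one_hot_encode("TTGACAGGCATGCATG")])
    print(filter_motifs(model, batch).shape)            # (16, 4, 8)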

    Interaction detection in neural networks

    Besides statistical methods for non-additive feature interaction detection [4,5], there also exist methods for neural networks [6,7,8]. However, these methods usually refer to interactions between features in the input space, whereas we are interested in higher-level features. One idea is to use the insights gathered from the tasks above to define a feature space after the convolutional layers. Furthermore, we analyze how to modify interaction detection methods so that they are applicable to CNNs.
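
    The simplest notion of a non-additive interaction between two motifs can be probed directly on a trained model: compare its output for sequences containing both motifs, each motif alone, and neither. The sketch below implements this check under strong simplifying assumptions (a fixed background sequence, hand-picked motifs and positions) and reuses the hypothetical helpers from the earlier sketches; it is not one of the cited detection methods [6,7,8].

    # Hedged sketch of a non-additivity check: a non-zero value of
    # f(AB) - f(A) - f(B) - f(0) ... rearranged as f(AB) - f(A) - f(B) + f(0)
    # indicates that the model output is not additive in the two motifs.
    import torch

    def insert(background: str, motif: str, pos: int) -> str:
        return background[:pos] + motif + background[pos + len(motif):]

    def interaction_score(model, background: str, motif_a: str, pos_a: int,
                          motif_b: str, pos_b: int) -> float:
        variants = [
            insert(insert(background, motif_a, pos_a), motif_b, pos_b),  # both
            insert(background, motif_a, pos_a),                          # A only
            insert(background, motif_b, pos_b),                          # B only
            background,                                                  # neither
        ]
        with torch.no_grad():
            out = torch.cat([model(one_hot_encode(s).unsqueeze(0)) for s in variants])
        f_ab, f_a, f_b, f_0 = out.flatten().tolist()
        return f_ab - f_a - f_b + f_0

    bg = "A" * 40                       # illustrative fixed background sequence
    print(interaction_score(model, bg, "TGACTCA", 5, "GGGCGG", 25))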

    Current Work

    We analyze various interpretability methods by using synthetic data with inserted interactions to see how these methods behave. Different architectures for our trained CNN models are included. Additionally, we visualize first-layer filters to see which motifs were learned by the models.
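
    For illustration, the following sketch generates the kind of synthetic benchmark described above: random background sequences in which the positive label requires both of two inserted motifs, so that a model can only solve the task by learning their interaction. Motif strings, sequence lengths, and probabilities are illustrative choices, not our actual data-generation setup.

    # Hedged sketch of a synthetic dataset with an inserted interaction:
    # the label is positive only if *both* motifs are present.
    import random

    MOTIF_A, MOTIF_B = "TGACTCA", "CACGTG"   # illustrative motif strings

    def random_seq(length: int = 200) -> str:
        return "".join(random.choice("ACGT") for _ in range(length))

    def make_example(length: int = 200) -> tuple[str, int]:
        seq = list(random_seq(length))
        has_a, has_b = random.random() < 0.5, random.random() < 0.5
        for present, motif in [(has_a, MOTIF_A), (has_b, MOTIF_B)]:
            if present:
                pos = random.randrange(length - len(motif))
                seq[pos:pos + len(motif)] = motif    # overwrite background
        label = int(has_a and has_b)                 # interaction: both required
        return "".join(seq), label

    dataset = [make_example() for _ in range(1000)]
    print(sum(label for _, label in dataset), "positive examples out of 1000")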

    References

    1. Sundararajan, Mukund, Ankur Taly, and Qiqi Yan. "Axiomatic attribution for deep networks." arXiv preprint arXiv:1703.01365 (2017). https://arxiv.org/abs/1703.01365
    2. Shrikumar, Avanti, Peyton Greenside, and Anshul Kundaje. "Learning important features through propagating activation differences." arXiv preprint arXiv:1704.02685 (2017). https://arxiv.org/abs/1704.02685
    3. Bach, Sebastian, et al. "On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation." PloS one 10.7 (2015): e0130140. https://doi.org/10.1371/journal.pone.0130140
    4. Bien, Jacob, Jonathan Taylor, and Robert Tibshirani. "A lasso for hierarchical interactions." Annals of statistics 41.3 (2013): 1111. https://dx.doi.org/10.1214%2F13-AOS1096
    5. Alin, Aylin, and S. Kurt. "Testing non-additivity (interaction) in two-way ANOVA tables with no replication." Statistical methods in medical research 15.1 (2006): 63-85. https://journals.sagepub.com/doi/abs/10.1191/0962280206sm426oa
    6. Janizek, Joseph D., Pascal Sturmfels, and Su-In Lee. "Explaining Explanations: Axiomatic Feature Interactions for Deep Networks." arXiv preprint arXiv:2002.04138 (2020). https://arxiv.org/abs/2002.04138
    7. Tsang, Michael, Dehua Cheng, and Yan Liu. "Detecting statistical interactions from neural network weights." arXiv preprint arXiv:1705.04977 (2017). https://arxiv.org/abs/1705.04977
    8. Tsang, Michael, et al. "Neural interaction transparency (NIT): Disentangling learned interactions for improved interpretability." Advances in Neural Information Processing Systems. 2018. http://papers.nips.cc/paper/7822-neural-interaction-transparency-nit-disentangling-learned-interactions-for-improved-interpretability