1.
Borchert, F., Meister, L., Langer, T., Follmann, M., Arnrich, B., Schapranow, M.-P.: Controversial Trials First: Identifying Disagreement Between Clinical Guidelines and New Evidence. AMIA Annual Symposium Proceedings. pp. 237–246. American Medical Informatics Association (2021).
2.
Borchert, F., Mock, A., Tomczak, A., Hügel, J., Alkarkoukly, S., Knurr, A., Volckmar, A.-L., Stenzinger, A., Schirmacher, P., Debus, J., Jäger, D., Longerich, T., Fröhling, S., Eils, R., Bougatf, N., Sax, U., Schapranow, M.-P.: Knowledge bases and software support for variant interpretation in precision oncology. Briefings in Bioinformatics. 22 (2021).
Precision oncology is a rapidly evolving interdisciplinary medical specialty. Comprehensive cancer panels are becoming increasingly available at pathology departments worldwide, creating the urgent need for scalable cancer variant annotation and molecularly informed treatment recommendations. A wealth of mainly academia-driven knowledge bases calls for software tools supporting the multi-step diagnostic process. We derive a comprehensive list of knowledge bases relevant for variant interpretation by a review of existing literature followed by a survey among medical experts from university hospitals in Germany. In addition, we review cancer variant interpretation tools, which integrate multiple knowledge bases. We categorize the knowledge bases along the diagnostic process in precision oncology and analyze programmatic access options as well as the integration of knowledge bases into software tools. The most commonly used knowledge bases provide good programmatic access options and have been integrated into a range of software tools. For the wider set of knowledge bases, access options vary across different parts of the diagnostic process. Programmatic access is limited for information regarding clinical classifications of variants and for therapy recommendations. The main issue for databases used for biological classification of pathogenic variants and pathway context information is the lack of standardized interfaces. There is no single cancer variant interpretation tool that integrates all identified knowledge bases. Specialized tools are available and need to be further developed for different steps in the diagnostic process.
3.
Rasheed, A., Borchert, F., Kohlmeyer, L., Henkenjohann, R., Schapranow, M.-P.: A Comparison of Concept Embeddings for German Clinical Corpora. 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). pp. 2314–2321 (2021).
Clinical concept embeddings enable unsupervised learning of relationships among medical concepts. A range of benchmarks quantifies the degree to which learned representations capture medical semantics. However, training and evaluating embeddings requires a large amount of data. In addition, embeddings’ benchmark scores vary across languages because they depend on the size of the available corpora. Multi-modal data increases the corpus size, but data protection regulations limit access to clinical multi-modal data. We present an extendable pipeline for training clinical concept embeddings on various text corpora and evaluating the quality of the trained embeddings on selected benchmark tasks. Our work provides different ways to identify clinical concepts in textual corpora. We train embeddings on selected German clinical text corpora and evaluate them on various benchmarks. Our work can be extended to train embeddings in other languages for which a large multi-modal dataset is not available.