Natural Language Processing

Intelligent data exploration tool to quickly extract relevant information from scientific publications

(Source: HPI)

Building on our natural language processing pipeline optimized for in-memory databases, we are developing an intelligent data exploration tool that helps physicians and biomedical researchers quickly extract relevant information from scientific publications. We perform ultra-fast text analysis on more than 26 million PubMed documents and integrate comprehensive terminologies from the Unified Medical Language System (UMLS) and Medical Subject Headings (MeSH). The system also offers advanced text-analysis features such as question answering, i.e., answering questions posed in natural language, and summarization, i.e., automatically building summaries for a selected collection of documents.
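
To give a rough idea of what summarization over a selected set of documents can look like, here is a minimal Python sketch of frequency-based extractive summarization. It is an illustration only and not the method used in our pipeline: the function names (`summarize`, `split_sentences`, `tokenize`), the small stop-word list and the scoring rule are all assumptions made for this example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "and", "of", "in", "to", "is", "was", "for", "with", "on"}


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(text):
    """Lowercase the text and keep alphabetic words that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(documents, max_sentences=2):
    """Return the highest-scoring sentences from a collection of documents.

    A sentence's score is the average collection-wide frequency of its words,
    so sentences about topics shared across the documents rank higher.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore original order for readability.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    docs = [
        "Aspirin reduces fever. The weather was nice.",
        "Aspirin reduces pain and fever.",
    ]
    for sentence in summarize(docs, max_sentences=2):
        print(sentence)
```

A production system would more plausibly score sentences using recognized concepts (for example MeSH terms) rather than raw word counts, but the overall structure of selecting and ordering representative sentences would be similar.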

Our prototype provides an innovative way to explore the biomedical literature through an intuitive touch user interface. Users can ask the system questions, navigate easily to answers, definitions and related publications, visualize MeSH concepts in the text, and build summaries at any time for any set of documents. From a biomedical point of view, our approach has the potential to empower biomedical researchers and physicians alike to discover previously unknown relations between, e.g., diseases, drugs and symptoms.