Prof. Dr. h.c. Hasso Plattner

Concepts of Modern Enterprise Applications

Industry trends such as smart devices (IoT), natural language processing, and machine learning on big data place high demands on modern enterprise systems, but also create a wealth of new opportunities. Our research targets effective methods and algorithms for extracting business-relevant insights from raw data, demonstrated by proof-of-concept prototypes built in close collaboration with our partners from different industries.

Data-driven Causal Inference

We address open challenges in causal inference through improvements in both the application of statistical and probabilistic concepts and GPU-based acceleration, in order to enable its use in real-world contexts. Read more.
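To illustrate the statistical concepts involved, the following is a minimal sketch of covariate (backdoor) adjustment on synthetic data. All names and numbers are illustrative and not taken from the project; it only shows why a naive difference in means is biased by a confounder, while adjustment recovers the true effect.

```python
# Minimal sketch: average treatment effect (ATE) via covariate adjustment
# on synthetic data. Names and data are illustrative, not the project's code.
import random

random.seed(0)

# Generate synthetic observations with a binary confounder z that
# influences both treatment assignment t and outcome y.
data = []
for _ in range(10000):
    z = random.random() < 0.5
    t = random.random() < (0.8 if z else 0.2)   # confounded assignment
    y = 2.0 * t + 3.0 * z + random.gauss(0, 1)  # true effect of t is 2.0
    data.append((z, t, y))

def mean_y(rows):
    return sum(r[2] for r in rows) / len(rows)

# Naive difference in means is biased upward by the confounder z.
naive = mean_y([r for r in data if r[1]]) - mean_y([r for r in data if not r[1]])

# Backdoor adjustment: average the stratum-wise effects,
# weighted by the marginal distribution of z.
ate = 0.0
for z_val in (False, True):
    stratum = [r for r in data if r[0] == z_val]
    effect = (mean_y([r for r in stratum if r[1]])
              - mean_y([r for r in stratum if not r[1]]))
    ate += effect * len(stratum) / len(data)

print(round(naive, 2), round(ate, 2))  # naive is inflated; ate is close to 2.0
```

Scaling exactly such stratification and weighting steps to many covariates and large datasets is where GPU acceleration becomes relevant.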

Data-Driven Decision-Making

The need for automated decision-making is steadily increasing. Our goal is to derive and implement methods for data-driven decision support in practical applications within constantly changing environments. Solving such problems requires combining data management, data science, and optimization. In general, a decision problem can be described by its performance criteria, the set of admissible decisions, constraints, and data-driven estimates of how decisions affect performance. Furthermore, every application has its own specifics, which can be exploited to solve the problem effectively. In our research, we consider different use cases and explore suitable optimization techniques. These problems fall into the areas of resource allocation and operations management. In particular, we are interested in finding robust solutions for uncertain and changing environments.
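The ingredients named above (performance criterion, admissible decisions, a constraint, data-driven estimates) can be made concrete with a toy resource allocation problem. The project names, costs, and values below are illustrative stand-ins for data-driven estimates, and the solver is a textbook 0/1 knapsack, not the group's method.

```python
# Minimal sketch of a resource allocation decision problem: choose a subset
# of projects under a budget constraint to maximize estimated total value.
# Costs and values are illustrative stand-ins for data-driven estimates.

def allocate(projects, budget):
    """0/1 knapsack via dynamic programming; projects are (name, cost, value).

    States map spent budget -> best (total value, chosen project names)."""
    best = {0: (0.0, [])}
    for name, cost, value in projects:
        # Snapshot current states so each project is used at most once.
        for spent, (total, chosen) in list(best.items()):
            new_spent = spent + cost
            if new_spent <= budget:
                cand = (total + value, chosen + [name])
                if new_spent not in best or cand[0] > best[new_spent][0]:
                    best[new_spent] = cand
    return max(best.values())

projects = [("A", 4, 10.0), ("B", 3, 7.0), ("C", 5, 9.0), ("D", 2, 4.0)]
value, chosen = allocate(projects, budget=9)
print(value, chosen)  # best admissible subset within the budget
```

In realistic settings the value estimates are uncertain, which is exactly why the robust solutions mentioned above become important.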

Enterprise Stream Benchmarking

The ever-increasing amount of data produced nowadays, from smart homes to smart factories, gives rise to completely new challenges and opportunities. Terms like "Internet of Things" (IoT) and "Big Data" have gained traction to describe the creation and analysis of these new data masses. New technologies have been developed that can handle and analyze data streams, i.e., data arriving at high frequency and in large volume. In recent years, for example, many distributed data stream processing systems have emerged, whose use represents one way of analyzing data streams.

Although a broad variety of systems and system architectures is generally a good thing, the bigger the choice, the harder it is to choose. Benchmarking is a common and proven approach to identify the best system for a specific set of needs. However, no satisfying benchmark for modern data stream processing architectures currently exists. Particularly in an enterprise context, i.e., where data streams have to be combined with historical and transactional data, existing benchmarks have shortcomings. The Enterprise Streaming Benchmark (ESB), currently under development, aims to tackle this issue. Read more.
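The workload class described above, streaming events combined with static enterprise data, can be sketched in a few lines. The scenario (machine temperatures checked against per-machine thresholds) and all names are illustrative assumptions, not part of the ESB specification.

```python
# Minimal sketch of a stream/enterprise-data workload: sensor readings are
# aggregated in tumbling windows and joined with static reference data.
# The scenario and all names are illustrative.
from collections import defaultdict

WINDOW = 10  # seconds per tumbling window

# Static enterprise data: machine id -> allowed temperature threshold.
thresholds = {"m1": 70.0, "m2": 65.0}

def windowed_violations(events):
    """events: iterable of (timestamp, machine_id, temperature)."""
    windows = defaultdict(list)
    for ts, machine, temp in events:
        windows[(ts // WINDOW, machine)].append(temp)
    # Join each window's average with the reference threshold.
    violations = []
    for (win, machine), temps in sorted(windows.items()):
        avg = sum(temps) / len(temps)
        if avg > thresholds[machine]:
            violations.append((win, machine, round(avg, 1)))
    return violations

stream = [(1, "m1", 68.0), (3, "m1", 75.0), (5, "m2", 60.0),
          (12, "m2", 66.0), (14, "m2", 68.0)]
print(windowed_violations(stream))  # windows whose average exceeds the limit
```

A streaming benchmark measures how well a system executes such windowed aggregations and joins under high event rates, which a batch-oriented benchmark does not capture.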

Machine Learning for Sales Order Fulfillment

Large enterprises and their information systems produce and collect large amounts of data related to different areas, e.g., manufacturing, finance, or human resources. This data can be used to complete tasks more efficiently, to automate tasks that are currently performed manually, and to generate insights that help solve domain-specific challenges. Read more.

In-Memory Natural Language Processing

The current data deluge demands fast, real-time processing of large datasets, including textual data such as scientific publications, web pages, or social media messages. Natural language processing (NLP) is the field of automatically processing textual documents and includes a variety of tasks such as tokenization (delimitation of words), part-of-speech tagging (assignment of syntactic categories to words), chunking (delimitation of phrases), and syntactic parsing (construction of a syntactic tree for a sentence). Read more.
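The first three pipeline stages named above can be sketched with deliberately tiny rule-based components. The lexicon and rules are illustrative assumptions to make the stages concrete; production systems use statistical or neural models instead.

```python
# Minimal sketch of three NLP pipeline stages: tokenization, dictionary-based
# part-of-speech tagging, and noun-phrase chunking. The tiny lexicon and the
# rules are illustrative; real systems use statistical models.
import re

LEXICON = {"the": "DET", "a": "DET", "fast": "ADJ", "parser": "NOUN",
           "builds": "VERB", "tree": "NOUN"}

def tokenize(text):
    # Split into word tokens and single punctuation characters.
    return re.findall(r"\w+|[^\w\s]", text.lower())

def pos_tag(tokens):
    def tag(t):
        if not t.isalpha():
            return "PUNCT"
        return LEXICON.get(t, "NOUN")  # heuristic: unknown words are nouns
    return [(t, tag(t)) for t in tokens]

def chunk_noun_phrases(tagged):
    """Greedy chunking: runs of DET/ADJ/NOUN that contain a noun."""
    phrases, current, has_noun = [], [], False
    for token, tag in tagged + [(None, "END")]:  # sentinel flushes the last run
        if tag in ("DET", "ADJ", "NOUN"):
            current.append(token)
            has_noun = has_noun or tag == "NOUN"
        else:
            if current and has_noun:
                phrases.append(" ".join(current))
            current, has_noun = [], False
    return phrases

tagged = pos_tag(tokenize("A fast parser builds the tree."))
print(tagged)
print(chunk_noun_phrases(tagged))  # -> ['a fast parser', 'the tree']
```

Each stage consumes the previous stage's output, which is why running such pipelines over large document collections benefits from keeping intermediate results in memory.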

High-Performance In-Memory Genome (HIG) Project

The continuous progress in understanding relevant genomic basics, e.g., for the treatment of cancer patients, collides with the tremendous amount of data that needs to be processed. For example, the human genome consists of approximately 3.2 billion base pairs, i.e., roughly 3.2 GB of data. Identifying a concrete sequence of 20 base pairs within the genome takes hours to days if performed manually. Processing and analyzing genomic data is a challenge for medical and biological research that delays the progress of research projects. From a software engineering point of view, improving the analysis of genomic data is both a concrete research and an engineering challenge. The aim of the HIG project is to combine knowledge of in-memory technology and of real-time analysis of huge amounts of data with the concrete research questions of medical and biological experts. Read more.
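The sequence-lookup problem mentioned above can be made concrete with a small sketch: an in-memory k-mer index answers exact-match queries by extending seed hits instead of scanning the whole genome. The toy genome, the choice of k, and the function names are illustrative and not the project's implementation.

```python
# Minimal sketch of index-based sequence lookup: a k-mer hash index answers
# exact-match queries without scanning the entire genome. The toy genome is
# illustrative; the real human genome has ~3.2 billion base pairs.
from collections import defaultdict

K = 5  # real pipelines index longer k-mers; 5 keeps the toy example small

def build_index(genome, k=K):
    """Map every k-mer to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(genome) - k + 1):
        index[genome[i:i + k]].append(i)
    return index

def find(query, genome, index, k=K):
    """Find exact occurrences of query by extending seed k-mer hits."""
    hits = []
    for pos in index.get(query[:k], []):
        if genome[pos:pos + len(query)] == query:
            hits.append(pos)
    return hits

genome = "ACGTACGTTAGGCTAACGTACGTA"
index = build_index(genome)
print(find("ACGTACGT", genome, index))  # positions of the query in the genome
```

Building the index costs one pass over the genome, after which each query touches only a handful of candidate positions, which is the kind of speedup that turns hours of manual searching into an interactive operation when the index is held in memory.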