Summer Semester 2014

09.04.2014 - Stephan Müller

Dynamic Aggregates Caching for Enterprise Applications

Modern enterprise applications generate a mixed workload composed of short-running transactional queries and long-running analytical queries containing expensive aggregations. Given that columnar in-memory databases are capable of handling these mixed workloads, we evaluate how existing materialized view maintenance strategies can accelerate the execution of aggregate queries. We contribute a novel materialized view maintenance approach that leverages the main-delta architecture of columnar in-memory databases and outperforms existing strategies for a wide range of workloads. As an optimization, we further propose an approach that adapts the aggregate maintenance strategy to the currently monitored workload characteristics.
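The idea of maintaining an aggregate on top of a main-delta architecture can be illustrated with a minimal sketch. It is not the implementation from the talk: the class `MainDeltaTable`, its methods, and the insert-only workload are assumptions made for illustration. The cached aggregate covers only the main store, and each query combines it with an on-the-fly aggregate over the (typically small) delta store.

```python
from collections import defaultdict


def aggregate(rows):
    """Sum the amount per key over an iterable of (key, amount) rows."""
    totals = defaultdict(int)
    for key, amount in rows:
        totals[key] += amount
    return dict(totals)


class MainDeltaTable:
    """Toy table (hypothetical): read-optimized main store plus write-optimized delta."""

    def __init__(self, main_rows):
        self.main = list(main_rows)
        self.delta = []
        self._cached_main_aggregate = None  # materialized aggregate over main only

    def insert(self, key, amount):
        # Assumption: insert-only workload. New rows land in the delta,
        # so the cached aggregate over main stays valid.
        self.delta.append((key, amount))

    def query_totals(self):
        # Compute the expensive aggregate over main once, then reuse it.
        if self._cached_main_aggregate is None:
            self._cached_main_aggregate = aggregate(self.main)
        # Combine the cached part with a fresh aggregate over the delta.
        result = dict(self._cached_main_aggregate)
        for key, amount in aggregate(self.delta).items():
            result[key] = result.get(key, 0) + amount
        return result

    def merge(self):
        # The delta merge moves rows into main; fold their totals into the
        # cache so it keeps describing exactly the main store.
        delta_totals = aggregate(self.delta)
        self.main.extend(self.delta)
        self.delta = []
        if self._cached_main_aggregate is not None:
            for key, amount in delta_totals.items():
                self._cached_main_aggregate[key] = (
                    self._cached_main_aggregate.get(key, 0) + amount
                )
```

In this sketch, inserts never invalidate the cached aggregate; maintenance work is only needed at merge time. Updates and deletes would need additional handling that is not shown here.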

16.04.2014 - Cape Town Workshop

No regular weekly meeting

23.04.2014 - (skipped)

30.04.2014 - Aragats Amirkhanyan

Introduction to the Research School

My name is Aragats Amirkhanyan. I was born on the 17th of February 1989. I am from the Russian Federation, and my hometown is Saint Petersburg. I attended school #433 and studied at the Bonch-Bruevich Saint-Petersburg State University of Telecommunications. I have work experience as a Senior Java Developer at several IT companies; the last two were Speech Technology Center and T-Systems CIS. I have been a PhD student at HPI since March 2014 and plan to try my hand at research work.

07.05.2014 - Ekaterina Bazhenova

My Background

I have been a PhD student at the Business Process Technology chair (Prof. Weske) since March 2013. I would like to join the HPI Research School to extend my professional network and to exchange experiences with HPI professors and other PhD students.

Preliminary Topic of Thesis

Rethinking Decision Theory For Business Process Management

My main research focus

Efficient business process redesign is a research and implementation challenge for both academia and industry. Traditional approaches to business process improvement are based on activity flows and do not consider the data of business processes. In my research, I want to provide an approach to business process improvement that is based on data and on combining data with decision theory. In particular, I plan to formalize and analyze the decision activities of business processes using techniques from decision theory.

14.05.2014 - Prof. Dr. Felix Naumann

How to review a paper

Open discussion, moderated by Prof. Dr. Felix Naumann.

28.05.2014 - Rami-Habib Eid-Sabbagh

Business Process Architecture: Concept and Analysis

In this talk I would like to introduce Business Process Architectures (BPAs), the topic of my PhD research.

Companies develop collections of hundreds of business process models that represent the complex system of cooperating entities that form an organization.

Designing and analyzing the structure of this system of business processes emerges as a new challenge, which is covered by the field of business process architecture.

Business process architectures provide a high-level view of the business processes and their interdependencies.

This is especially important for examining the impact of restructuring and optimizing business processes.

After introducing our BPA approach, I will present a technique based on open nets for analyzing process interdependencies at the BPA level at design time.

04.06.2014 - Thomas Brand

Advancing the evolution of complex systems provided by software ecosystems

The evolution of a software product is a sophisticated challenge. This is especially true if complex software systems based on this product are provided to many customers by a software ecosystem consisting of a major software vendor like SAP, consulting firms for customization, and third-party software extension vendors. First of all, it takes experience as well as excellent relationships with the ecosystem partners and customers to make good evolution decisions. Additionally, an evolution supported by self-adaptive capabilities of the software system might make this task significantly easier and less risky. During the presentation I would like to illustrate the challenge with the help of a practical example, give you more insight into the motivation for researching this subject, and introduce you to some of the related questions.

11.06.2014 - Thomas Kowark

Simplifying Software Repository Analysis through Collective, Incremental Ontology Matching

Over the course of software development projects, software repositories accumulate a wealth of data that has the potential to provide decision support to practitioners. However, without knowing which relationships are worth monitoring, the available data cannot be used to its full potential. To this end, the software repository mining community researches these repositories and captures the gathered insights in publications and by creating new analysis tools as well as improving existing ones.

These tools and the underlying knowledge, however, are disconnected, meaning that changed or newly discovered metrics and models have to be made available for each analysis tool separately, e.g., by creating new plugins. My work presents an approach to overcome this issue by allowing queries on software repositories to be transferred between different implementations. Due to the heterogeneity and constant change of the employed groupware tools, this is not a trivial task. Differences in data schemas and semantics need to be handled, i.e., ontologies have to be matched. While this task can be supported through automatic matching to a certain degree, a considerable number of matching tasks require manual user interaction.

The presented approach integrates these matching tasks into the process of query translation. Thus, users get direct feedback about the correctness of the generated alignment and the immediate benefit of obtaining answers to the questions that are reflected by the queries. We implemented this concept as part of a repository for patterns in groupware activity. This repository collectivizes the necessary translation efforts as each user contributes in the scope of their queries of interest. Furthermore, existing alignments are chained in order to further minimize the effort necessary to execute queries on as many repository implementations as possible. We evaluate our approach by showing how it simplifies the implementation of realistic use cases in comparison to existing, state-of-the-art analysis tools.
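The chaining of existing alignments can be pictured as composing mappings between schemas. The following is a minimal sketch under strong simplifying assumptions, not the approach's actual implementation: alignments are modeled as plain one-to-one dictionaries between term names, and all schema terms (`commit.author`, `changeset.user`, and so on) as well as the function names are hypothetical.

```python
def chain_alignments(first, second):
    """Compose two alignments: a term maps through `first`, then through `second`.

    Terms whose intermediate counterpart has no match in `second` are dropped,
    since no chained correspondence can be derived for them.
    """
    return {src: second[mid] for src, mid in first.items() if mid in second}


def translate_query(terms, alignment):
    """Rewrite query terms using an alignment.

    Returns the translated terms and the terms without a match; the latter
    would be the candidates for a manual matching task.
    """
    translated, unmatched = [], []
    for term in terms:
        if term in alignment:
            translated.append(alignment[term])
        else:
            unmatched.append(term)
    return translated, unmatched


# Hypothetical alignments between three repository schemas A, B, and C.
a_to_b = {"commit.author": "changeset.user", "commit.date": "changeset.timestamp"}
b_to_c = {"changeset.user": "revision.committer"}

# Derive A -> C from the two existing alignments instead of matching by hand.
a_to_c = chain_alignments(a_to_b, b_to_c)
translated, unmatched = translate_query(["commit.author", "commit.date"], a_to_c)
```

Here `commit.author` is translated through the chained alignment, while `commit.date` remains unmatched and would have to be aligned by a user.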

02.07.2014 - Arvid Heise

Data Cleansing and Integration Operators for a Parallel Data Analytics Framework

Real-world datasets are dirty, especially when Big Data analytics involve several, possibly user-curated datasets. In my thesis, I introduce a set of five concise data cleansing and integration operators, which increase the data quality of individual datasets and help to integrate them into larger datasets. I define the semantics of the operators, implement them on the general-purpose, parallel data analytics framework Stratosphere, and devise optimization techniques that allow holistic optimization of complex workflows. With the GovWILD case study, we use Stratosphere to integrate open government data in an efficient manner. In my talk, I focus on the duplicate detection operator and the estimation of the number and sizes of duplicate clusters.
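To make the notion of duplicate clusters concrete, the sketch below groups records into clusters from a given list of pairwise duplicate matches, assuming that duplicate relationships are transitive. It is neither the Stratosphere operator nor the estimation technique from the talk; it computes exact cluster sizes on a single machine with a union-find structure, and the function name and input format are assumptions.

```python
from collections import Counter


def duplicate_clusters(num_records, duplicate_pairs):
    """Return the sizes of all duplicate clusters (size > 1), largest first.

    Records are identified by the integers 0 .. num_records - 1, and
    `duplicate_pairs` lists the record pairs judged to be duplicates.
    """
    parent = list(range(num_records))

    def find(x):
        # Follow parent links to the cluster representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in duplicate_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    sizes = Counter(find(i) for i in range(num_records))
    return sorted((s for s in sizes.values() if s > 1), reverse=True)


# Six records: 0, 1, 2 are mutual duplicates, 4 and 5 are duplicates, 3 is unique.
cluster_sizes = duplicate_clusters(6, [(0, 1), (1, 2), (4, 5)])
```

In this example there are two duplicate clusters, one with three records and one with two.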

16.07.2014 - Fahad Khalid

On Design Patterns, Dependences, and Tool Guided Parallelization for Hybrid Architectures

In previous work, I've shown how design patterns for parallel programming can be used to decide which computational kernels within an application should be executed on the CPU, and which kernels should be executed on the GPU. This is the foundation for the architecture-based algorithm decomposition approach.

In this talk, I'll show how dependence analysis (founded in linear algebra and integer linear programming) can be used to extract design patterns for parallel programming from an existing code base. This approach is based on a novel mathematical description of design patterns in parallel programming. I'll further show how this method is expected to form the basis of tool-guided parallelization for hybrid architectures.