Winter Semester 2014/15

15.10.2014 - Fall Retreat

no official weekly meeting

22.10.2014 - No Meeting

29.10.2014 - Future SOC Lab Day

Please attend the talk "Open Government Data Integration with Stratosphere" at 14:30 or "Performance Optimization of Data Mining Ensemble Algorithms on SAP HANA" at 15:40.

Feel free to attend the entire Future SOC Lab Day.

05.11.2014 - Lena Herscheid

Formal Methods for Dependability 101: Impressions from MOD 2014 Summer School

This presentation summarizes experiences from the MOD 2014 summer school on dependable software systems engineering.
Several formal methods for assuring software dependability are briefly introduced. They are compared with regard to their usability, scalability, and readiness for use in complex software systems.

12.11.2014 - Stefan Klauck

Generic What-If Analyses Using an In-Memory Column Store

Companies measure their success based on key performance indicators, whose interdependencies can be represented by mathematical models, such as value driver trees. While such models have commonly agreed semantics, they lack the right tool support for business simulations, because a flexible implementation that supports multidimensional, hierarchical structures on the basis of large data sets is complex and computationally challenging. My research addresses these problems and searches for a way to define and run generic what-if analyses that calculate how the outputs of a mathematical model change with varying inputs. This talk presents the concepts behind the approach, comprising a metamodel to describe the dependencies of value drivers as a graph, a way to define the data binding for single nodes of the graph, and possibilities to specify and calculate simulation scenarios.
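The three ingredients named in the abstract (a dependency graph of value drivers, a data binding per node, and a simulation scenario) can be illustrated with a minimal sketch. It is not the speaker's implementation: the names Node, evaluate, and the example tree are hypothetical, and a plain dictionary stands in for the in-memory column store.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Node:
    """One value driver. Leaves carry a data binding, inner nodes a formula."""
    name: str
    binding: Optional[str] = None          # key into the data source (assumption)
    inputs: List[str] = field(default_factory=list)
    formula: Optional[Callable[..., float]] = None


def evaluate(tree: Dict[str, Node], data: Dict[str, float], name: str,
             overrides: Optional[Dict[str, float]] = None) -> float:
    """Compute a node's value; overrides replace node values for a scenario."""
    overrides = overrides or {}
    if name in overrides:
        return overrides[name]
    node = tree[name]
    if node.formula is None:
        # Leaf node: read the bound value (a database query in a real system).
        return data[node.binding]
    args = [evaluate(tree, data, child, overrides) for child in node.inputs]
    return node.formula(*args)


# Hypothetical value driver tree: profit = revenue - cost, revenue = price * volume.
tree = {
    "price": Node("price", binding="avg_price"),
    "volume": Node("volume", binding="units_sold"),
    "cost": Node("cost", binding="total_cost"),
    "revenue": Node("revenue", inputs=["price", "volume"],
                    formula=lambda p, v: p * v),
    "profit": Node("profit", inputs=["revenue", "cost"],
                   formula=lambda r, c: r - c),
}
data = {"avg_price": 10.0, "units_sold": 100.0, "total_cost": 600.0}

baseline = evaluate(tree, data, "profit")
# What-if scenario: what happens to profit if the price rises to 11?
scenario = evaluate(tree, data, "profit", overrides={"price": 11.0})
```

Keeping the scenario as a set of overrides leaves the bound data untouched, so several scenarios can be compared against the same baseline.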

19.11.2014 - Cheng Wang

Learning Semantic Representation Using Multi-Level Online LDA for Text Categorization

This talk presents a novel approach for extracting semantics at multiple topic levels for text representation. Our approach extends Online LDA (OLDA) to multiple topic levels in order to learn a semantic space with different topic granularities. The effectiveness of our approach is evaluated on both the large-scale Wikipedia dataset and the medium-sized 20 Newsgroups dataset. The former experiment shows that topics generated at different topic levels have different semantic coverage, which makes them more appropriate for representing text content. The classification on 20 Newsgroups shows the effectiveness of fusing word and semantic features, which outperforms the state-of-the-art result in text categorization tasks. Fine-tuning in the feature fusion experiment shows that the Support Vector Machine (SVM) is much more sensitive to semantic features than Naive Bayes (NB), K-Nearest Neighbors (KNN), and Decision Tree (DT), and achieves the best classification result using 4k word features and 0.4k semantic features with much less time consumption.

26.11.2014 - No weekly meeting

Alexander Albrecht's disputation will take place at 15:00 in H-2.57. Feel free to attend.

03.12.2014 - Alexandra Ion

Physical Motion Displays: Spatial Tactile Messages in Wearable Devices

We propose communicating simple two-dimensional shapes/messages to users by dragging a physical tactor across their skin. The main benefit of this approach is that it produces not only a sense of touch, but also stretches the user’s skin, thereby reaching more skin receptors than the currently prevalent modality, i.e., vibrotactile. We appropriate this concept of moving a physical tactor across the user’s skin as a means for delivering tactile messages to the user. We present a prototype that implements the concept as a wearable device that users wear on their forearm. To demonstrate the potential of the concept for actual use, we show a second prototype that fits under a watch.

10.12.2014 - Dietmar Funck

Classification of 3D Point Clouds

3D point clouds are a representation of surfaces, describing them as a discrete set of points. Different remote sensing techniques are available for creating 3D point clouds (e.g., laser scanning or image matching). The structure and the available point attributes of a 3D point cloud may depend on the remote sensing technique used.

3D point clouds are used in several applications, such as urban and landscape planning, building reconstruction or documentation of sites or vegetation.

Usually, 3D point clouds are only used as input data for these applications to derive further information or representations, such as 3D models. To derive such data and for further analysis, applications often use only points belonging to a specific object class (e.g., ground, building, vegetation). However, semantic information, such as the object class of a point, is not available after data capture.

Therefore, classification approaches are used to derive semantic information based on the structure of the 3D point cloud and available point attributes. Usually, classification approaches make use of segmentation approaches to group points in homogeneous areas together.

The talk will present a classification approach which adapts to the average point density and available point attributes.

17.12.2014 - Tim Chen

Autocomplete Painting Repetitions (SIGGRAPH Asia)

Painting is a major form of content creation, offering unlimited control and freedom of expression. However, it can involve tedious manual repetitions, such as stippling large regions or hatching complex contours. Thus, a central goal in digital painting research is to automate tedious repetitions while allowing user control. Existing methods impose a sequential order, in which a small exemplar is prepared and then cloned through additional gestures. Such a sequential mode may break the continuous, spontaneous flow of painting. Moreover, it is more suitable for homogeneous areas than for the nuanced variations common in real paintings.

We present an interactive digital painting system that auto-completes tedious repetitions while preserving nuanced variations and maintaining natural flows. Specifically, users paint as usual, while our system records and analyzes their workflows. When potential repetition is detected, our system predicts what the user might want to draw and offers auto-completes that adjust to the existing shape-color context. Our method eliminates the need for sequential creation-cloning and better adapts to the local painting contexts. Furthermore, users can choose to accept, ignore, or modify those predictions and thus maintain full control. Our method can be considered the painting analogue of auto-completes in common typing and IDE systems. We demonstrate the quality and usability of our system through painting results and a pilot user study.

07.01.2015 - No weekly meeting

14.01.2015 - Shu Li

Traffic Congestion Control at Network Level with HANA

Since the existence of the Macroscopic Fundamental Diagram (MFD) was observed in urban areas, various traffic control strategies have been proposed to relieve congestion at the network level. Data from Vehicle Plate Recognition Systems (VPRS), which are being adopted by an increasing number of cities, could support these strategies. This paper analyzes data from such a system with the in-memory database HANA in order to apply a network-level control strategy. The key advantage of the proposed approach is that real-time traffic control at the network level can be implemented by combining the computational capabilities of HANA with data from VPRS.

21.01.2015 - Thijs Roumen

NotiRing, a comparative study of notification channels for wearable interactive rings

We conducted an empirical investigation of the noticeability of five notification channels (light, vibration, sound, poke, thermal) on wearable interactive rings during five levels of physical activity (lying down, sitting, standing, walking, and running). Results showed that vibration was the most reliable and fastest channel for conveying notifications, followed by poke and sound, which shared similar noticeability. The noticeability of these three channels was not affected by the level of physical activity. The other two channels, light and thermal, were less noticeable and were affected by the level of physical activity. Our post-experimental survey indicates that while noticeability performance has a significant influence on user preference, each channel has its own unique advantages that make it suitable for different notification scenarios.

28.01.2015 - Nico Herzberg

Process Intelligence in Non-automated Process Environments

The execution of business processes generates a lot of data comprising final process results as well as information about intermediate activities, both communicated as events. Automated process execution environments are centrally controlled by process engines that hold the connection between events and the processes they occur in. In contrast, in manual process execution environments, e.g., healthcare and logistics, these events may not be correlated to the process they originate from. The correlation information is usually not present in the event itself but in so-called context data, which exists orthogonally to the corresponding process. However, in the areas of process monitoring and analysis, events need to be correlated to specific process instances. To close the gap between recorded events without process correlation and required events with process correlation, a framework is proposed that enriches recorded events with context data to create events correlated to processes, so-called process events. Further, the application of these process events for process monitoring and analysis is shown in several environments.
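The enrichment step described in the abstract can be pictured with a small sketch. It only illustrates the idea of joining recorded events with context data and is not the proposed framework: the function correlate_events, the field names object_id and process_instance, and the healthcare sample data are all hypothetical.

```python
from typing import Dict, List, Tuple

Event = Dict[str, str]


def correlate_events(events: List[Event],
                     context: Dict[str, str]) -> Tuple[List[Event], List[Event]]:
    """Enrich recorded events with a process instance taken from context data.

    `context` maps an object identifier found in the event (assumption) to the
    process instance it belongs to. Events without a match stay uncorrelated.
    """
    process_events: List[Event] = []
    uncorrelated: List[Event] = []
    for event in events:
        instance = context.get(event["object_id"])
        if instance is None:
            uncorrelated.append(event)
        else:
            # Copy the event and attach the correlation information.
            process_events.append({**event, "process_instance": instance})
    return process_events, uncorrelated


# Hypothetical events recorded in a manual (non-automated) environment.
events = [
    {"event_id": "e1", "object_id": "patient-17", "activity": "blood sample taken"},
    {"event_id": "e2", "object_id": "patient-42", "activity": "x-ray recorded"},
    {"event_id": "e3", "object_id": "patient-99", "activity": "bed assigned"},
]
# Hypothetical context data that lives outside the events themselves.
context = {"patient-17": "treatment-case-A", "patient-42": "treatment-case-B"}

process_events, uncorrelated = correlate_events(events, context)
```

Returning the uncorrelated events separately keeps them available for a later pass once more context data is known.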

04.02.2015 - Ahmad Samiei, Thomas Baier

Ahmad Samiei: Incremental Record Deduplication

Databases play an important role in IT-based companies nowadays, and many industries and organizations rely on the accuracy of the data in their databases to perform their operations. Unfortunately, the data are not always clean. For instance, real-world entities may have multiple, different representations, which can be due to erroneous data entry, data evolution, data integration, etc. This in turn introduces so-called duplicates into the databases. Deduplication aims to detect and eliminate different representations of the same real-world entity in a database. The focus of this work is on incremental deduplication, a more recent topic in deduplication. Deduplication is a time-intensive process, and the sheer amount of data added to an already deduplicated database makes the database unreliable and unusable, which imposes extra costs on the industries. Incremental record deduplication attempts to address this problem and keep databases with many transactions up to date and clean. That is, deduplication must happen on the fly, as the data arrives and enters the database.
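The idea of deduplicating on the fly can be sketched as follows. This is a simplified illustration and not the method presented in the talk: the class IncrementalDeduplicator, the first-letter blocking key, the string similarity from Python's difflib, and the threshold of 0.9 are all assumptions chosen for brevity.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from typing import DefaultDict, List, Tuple


class IncrementalDeduplicator:
    """Assign each arriving record to a duplicate cluster at insertion time."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold
        # Blocking index: key -> list of (normalized record, cluster id).
        self.blocks: DefaultDict[str, List[Tuple[str, int]]] = defaultdict(list)
        self.next_cluster = 0

    @staticmethod
    def _normalize(name: str) -> str:
        # Lowercase and collapse whitespace before comparing.
        return " ".join(name.lower().split())

    def insert(self, name: str) -> int:
        """Insert one record and return the id of the cluster it belongs to."""
        norm = self._normalize(name)
        key = norm[:1]  # naive blocking: compare only records sharing a first letter
        for other, cluster in self.blocks[key]:
            if SequenceMatcher(None, norm, other).ratio() >= self.threshold:
                # Similar enough: treat as another representation of the same entity.
                self.blocks[key].append((norm, cluster))
                return cluster
        # No match in the block: the record starts a new cluster.
        cluster = self.next_cluster
        self.next_cluster += 1
        self.blocks[key].append((norm, cluster))
        return cluster


dedup = IncrementalDeduplicator()
cluster_ids = [dedup.insert(name)
               for name in ["John Smith", "Jon  Smith", "Jane Smith", "Mary Jones"]]
```

Each insertion compares the new record only against records in its own block, so the work per arriving record stays small compared with re-running deduplication over the whole database.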