Summer Semester 2019

10.04.2019 - Symposium on Future Trends in Service-Oriented Computing

17.04.2019 - Michael Meinig

Rough Logs - A Data Reduction Approach for Log Files

Modern scalable information systems produce a constant stream of log records to describe their activities and current state. This data is increasingly used for online anomaly analysis, so that dependability problems such as security incidents can be detected while the system is running. Due to the constant scaling of many such systems, the amount of processed log data is a significant aspect to be considered in the choice of any anomaly detection approach. We therefore present a new idea for log data reduction called ‘rough logs’. It uses rough set theory to reduce the number of attributes that are collected in log data to represent events in the system. We tested the approach in a large case study; the experiments showed that the data reduction possibilities proposed by our approach remain valid even when the log information is modified due to anomalies happening in the system.
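To illustrate the general idea of attribute reduction with rough sets, here is a minimal sketch. It is not the speaker's implementation: the log records, attribute names and the greedy strategy are assumptions made for this example. An attribute counts as dispensable if dropping it leaves the indiscernibility partition of the records unchanged.

```python
from collections import defaultdict


def partition(records, attributes):
    """Group record indices that are indiscernible on the given attributes."""
    blocks = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(record[a] for a in attributes)
        blocks[key].append(index)
    return sorted(blocks.values())


def greedy_reduct(records, attributes):
    """Drop attributes one by one while the partition stays unchanged.

    This yields one reduct, not necessarily the smallest one.
    """
    target = partition(records, attributes)
    kept = list(attributes)
    for attribute in attributes:
        candidate = [a for a in kept if a != attribute]
        if partition(records, candidate) == target:
            kept = candidate
    return kept


# Hypothetical log records, invented for this sketch.
LOG = [
    {"host": "web1", "service": "http", "port": 80, "level": "info"},
    {"host": "web1", "service": "http", "port": 80, "level": "error"},
    {"host": "db1", "service": "sql", "port": 5432, "level": "info"},
    {"host": "web2", "service": "http", "port": 80, "level": "info"},
]
ATTRIBUTES = ["host", "service", "port", "level"]

if __name__ == "__main__":
    print(greedy_reduct(LOG, ATTRIBUTES))
```

In this toy data set, "service" and "port" can be dropped because "host" and "level" already distinguish all four records.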

24.04.2019 - Sebastian Marwecki

Stuff-Haptics: Passive Haptics Experiences Generated From Arbitrary Sets of Physical Props

Current passive haptics experiences only run with a premeditated set of physical props, preventing such experiences from running anywhere else. We present Stuff-Haptics, a software system that allows passive haptics experiences to run on different sets of props, such as physical objects found in the home. Stuff-Haptics accomplishes this by allowing experience designers to define the virtual objects in their experience using a generic format. This allows the system to run the experience in a wide range of locations by procedurally modelling virtual object sets to match the available props.

01.05.2019 - no meeting (Tag der Arbeit)

08.05.2019 - Prof. Dr. Felix Naumann

Open, Moderated Discussion on Reviewing Scientific Work

Elements of a review
Single-, Double-, and Triple-blind reviewing
How to write reviews?
What makes a good review?
How to write for reviewers?
How to rebut a review and when to complain.
Good and bad experiences with reviews

15.05.2019 - Felicia Burtscher

On Waddington landscapes, a prediction tool for the Malaria community and RootSkel — A presentation of past work and plans for future work.

In this presentation I am going to present three very different projects from my Master's studies. Waddington or epigenetic landscapes have been used as a conceptual model for cell differentiation in biology since the 1940s. In this project, we aimed to produce a software package in Julia with various features to generate and analyse the landscape for the user. Computing the host-pathogen mapping ratio to predict the read depth of dual RNA-seq data is a challenge Malaria researchers often face. The second project, an interactive website for the Malaria community, helps to predict this ratio based on different information on the sample. RootSkel is a standalone software package developed for plant morphology researchers analysing the phenomenon of electrotropism, the response of a plant to an electric field. More explicitly, the software computes the angle of the curved root tip with minimal user input and thus standardises and automates the formerly manual angle computation.

22.05.2019 - Christoph Matthies

Agile Software Process Improvement (in Retrospectives)

Working in iterations and repeatedly improving team workflows based on collected feedback is fundamental to agile software development processes. Scrum, the most popular agile method, provides dedicated retrospective meetings to reflect on the last development iteration and to decide on process improvement actions. However, agile methods do not prescribe how these improvement actions should be identified, managed or tracked in detail. The approaches to detect and remove problems in software development processes are therefore often based on intuition and prior experiences and perceptions of team members. Previous research in this area has focused on approaches to elicit a team's improvement opportunities as well as measurements regarding the work performed in an iteration, e.g. Scrum burn-down charts. Little research deals with the quality and nature of identified problems or how progress towards removing issues is measured. In this research, we investigate how agile development teams in the professional software industry organize their feedback and process improvement approaches. In particular, we focus on the structure and content of improvement and reflection meetings, i.e. retrospectives, and their outcomes. Researching how the vital mechanism of process improvement is implemented in practice in modern software development leads to a more complete picture of agile process improvement.

29.05.2019 - Stefan Neubert

Complexity lower bounds with Fine-Grained Reductions

How fast can you solve a given algorithmic problem? We can approach this fundamental question from two different directions. By developing new algorithms that outperform existing solutions, one shows upper bounds on the problem’s complexity. Lower bounds, on the other hand, rule out the existence of faster algorithms. They are, however, usually based on conjectured lower bounds for other problems, most prominently the P≠NP conjecture. After a short overview of my previous work at HPI, the talk will introduce the two main concepts of my current research, namely enumeration problems and Fine-Grained Reductions. Instead of only deciding a problem or producing one solution to it, in enumeration we ask for an efficient output of all solutions. This requires a different conception of complexity than, for example, a standard decision or search problem. Reducing problems to one another helps to cluster problems into classes of similar complexity. Recently, the notion of fine-grained reductions was introduced to denote very carefully designed reductions that allow transferring specific polynomial complexity lower bounds.

05.06.2019 - Gerardo Vitagliano

Table recognition in Multitable Spreadsheets

Data comes in various shapes, formats and flavours: this is the reason why data practitioners spend much of their time preparing it before using it in a given downstream application. Spreadsheet files are amongst the most common formats to store and analyze data. These are generally used as "whiteboards", freely laying out data for human inspection and usage. However, these arbitrary arrangements of data make automated ingestion or preparation difficult. In fact, one of the major challenges encountered is the presence, in a single spreadsheet, of more than one source of information, i.e. tables. The talk will introduce the problem of recognizing tables in multitable files, motivate it in the context of automated data preparation, and outline our proposed solution. We exploit the inherent visual information that lies in human-produced spreadsheets by rendering them as images. Then, a clustering approach is used to group blocks of pixels found in the images and recognize distinct tables.
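As a rough illustration of the grouping step, the sketch below treats a spreadsheet as a grid of cells instead of a rendered image and groups adjacent non-empty cells with a breadth-first search. This is a deliberately simplified stand-in, not the clustering approach of the talk; the function name find_tables and the sample sheet are assumptions made for this example.

```python
from collections import deque


def find_tables(grid):
    """Return bounding boxes (top, left, bottom, right) of connected
    regions of non-empty cells; empty cells are represented as None."""
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    seen = set()
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None or (r, c) in seen:
                continue
            # Breadth-first search over 4-connected non-empty neighbours.
            queue = deque([(r, c)])
            seen.add((r, c))
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] is not None and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Hypothetical sheet with two tables separated by empty cells.
SHEET = [
    ["name", "qty", None, None, None],
    ["pen", 3, None, "city", "zip"],
    [None, None, None, "Potsdam", "14482"],
]

if __name__ == "__main__":
    print(find_tables(SHEET))
```

Real spreadsheets contain empty cells inside tables and touching tables, which is where this naive grouping breaks down and a more robust clustering becomes necessary.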

12.06.2019 - Andreas Grapentin

InstantLab - Running Virtualised Heterogeneity Experiments in the Cloud

InstantLab has been conceived as a platform for on-demand operating system experiments to be used for teaching and research. Predefined system images can be instantiated on a number of different, individually managed cloud resources, and are presented in a consistent and abstract web-based interface to the user. In this talk, I will briefly outline the history and architecture of InstantLab and the current state of the project, and present plans to extend the functionality of the platform with a heterogeneous virtualisation adapter, to allow the evaluation of workloads in emulated heterogeneous systems.

19.06.2019 - Vanja Doskoc

Introduction to Algorithmic Learning Theory

2, 4, 8, 16, 32,... What is the next element in the sequence? What set are we seeing here? How do we know that we are correct? As a first step to tackle these questions, one would have to clarify when a set is identified. While doing this, we may also ask whether we want the explanation to be semantically correct, or only syntactically. Do we allow mistakes in our final conjecture? After we clarify this, we may also add additional restrictions, for example on the information available. What happens if the memory is restricted? What happens if we do not have the full information? Also, one can question the suggestions themselves. Should the new suggestion be in line with what we have seen so far? Should we change our mind even though the new element fits into the last conjecture? When should we throw an idea overboard? These and more questions are investigated within Algorithmic Learning Theory. We use rigorous analysis and model many different learning paradigms, coming from nature or psychology. Then, we compare those learning paradigms and see how powerful they are. In this talk, I will present an overview of the topic itself and give you a feeling for what I do.
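The questions above can be made concrete with a small sketch of a learner that works by enumeration: it always conjectures the first hypothesis in a fixed list that is consistent with all data seen so far, and it only changes its mind when the current conjecture is contradicted. The hypothesis list and the function names are assumptions chosen for this example, not material from the talk.

```python
# Hypothetical hypothesis space, ordered from most specific to most general.
HYPOTHESES = [
    ("powers of two", lambda n: n > 0 and n & (n - 1) == 0),
    ("even numbers", lambda n: n % 2 == 0),
    ("all natural numbers", lambda n: n >= 0),
]


def learner(data):
    """Return the name of the first hypothesis consistent with the data."""
    for name, contains in HYPOTHESES:
        if all(contains(x) for x in data):
            return name
    return None  # no hypothesis in the list explains the data


def conjectures(stream):
    """Feed the stream element by element and record every conjecture."""
    seen = []
    result = []
    for x in stream:
        seen.append(x)
        result.append(learner(seen))
    return result


if __name__ == "__main__":
    print(conjectures([2, 4, 8, 16, 32]))
    print(conjectures([2, 4, 6, 7]))
```

On the stream 2, 4, 8, 16, 32 the learner never changes its mind; on 2, 4, 6, 7 it revises its conjecture twice. The order of the list matters: because more specific sets come first, the learner does not overgeneralise from positive examples alone.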

26.06.2019 - Julian Risch

Analyzing Reader Comments on News Articles

Comment sections on online news platforms are an essential space to express opinions and discuss various topics. These platforms face enormous challenges because of the overwhelming and ever-increasing number of received comments. The overload of information not only renders content moderation infeasible, but also hinders users from engaging in discussions, so that no dialogs emerge. We therefore investigate the research question: "How can we foster engaging, respectful, and informative online discussions?". This talk focuses on our most recent progress in a project on comment ranking, where we train deep neural networks to improve on the chronological order of comments. Further, we present efforts to support the repeatability of our research and define the concept of "data repeatability".

03.07.2019 - Francesco Quinzan

Maximizing Submodular Functions under Matroid and Knapsack Constraints

Submodular functions capture the notion of diminishing returns, i.e. the more we acquire, the less our marginal gain will be. This notion occurs frequently in the real world, so the problem of maximizing a submodular function is applicable in many scenarios. Examples of such scenarios include maximum cut problems, combinatorial auctions, facility location, problems in machine learning, coverage functions, and online shopping. As such, the literature on submodular functions contains a vast number of results spanning over three decades. Often in these applications, a realistic solution is subject to some constraints. Among the most common constraints are matroid and knapsack constraints. Common examples of matroid constraints are uniform matroids, also known as cardinality constraints, and partition matroids. In this talk, I will introduce the notion of submodular functions and of matroid and knapsack constraints. I will then discuss how to approach these problems, and conclude with an overview of a real-world application.
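As a small worked example of diminishing returns, consider a coverage function f(S), the number of elements covered by the sets in S. It is submodular: for A ⊆ B and a set x not in B, f(A ∪ {x}) - f(A) ≥ f(B ∪ {x}) - f(B). The sketch below applies the classic greedy algorithm under a cardinality constraint (a uniform matroid). It is a textbook baseline rather than the method of the talk, and the sample sets are invented for illustration.

```python
def coverage(sets, chosen):
    """Coverage function: number of elements covered by the chosen sets."""
    covered = set()
    for name in chosen:
        covered |= sets[name]
    return len(covered)


def greedy_max(sets, k):
    """Pick at most k sets, always adding the one with the largest marginal gain."""
    chosen = []
    for _ in range(k):
        best, best_gain = None, 0
        for name in sorted(sets):  # sorted for deterministic tie-breaking
            if name in chosen:
                continue
            gain = coverage(sets, chosen + [name]) - coverage(sets, chosen)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining set adds anything
            break
        chosen.append(best)
    return chosen


# Hypothetical ground sets, invented for this sketch.
SETS = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 6},
}

if __name__ == "__main__":
    picked = greedy_max(SETS, 2)
    print(picked, coverage(SETS, picked))
```

For monotone submodular functions under a cardinality constraint, this greedy strategy is known to achieve at least a (1 - 1/e) fraction of the optimal value. Note how the marginal gain of "b" drops from 3 to 1 once "a" has been chosen.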

10.07.2019 - Shohei Katakura & Muhammad Abdullah

A 3D Printer Head as a Robotic Manipulator

Three-dimensional (3D) printers, which can print out 3D geometric data in the physical world, are becoming affordable, especially fused deposition modeling (FDM) 3D printers. We believe that 3D printers have potential beyond printing, because FDM 3D printers are intrinsically three-axis Cartesian coordinate robots that have a filament extruder. The motivation behind our research is to redesign 3D printers, taking advantage of the three-axis robot features and the ability to generate physical objects. We introduce new ways of using a 3D printer head as a 3-axis robotic manipulator to enable advanced fabrication and usage such as breaking support materials, assembling separately printed parts and actuating printed objects on a build-plate. To achieve these manipulations, we customize a low-cost commodity FDM 3D printer so that it can attach and detach printed end-effectors which change the function of the 3D printer head (e.g. hook, break, and rotate printed objects). With these advanced fabrication techniques, a low-cost FDM 3D printer can print kinetic objects such as bevel gears and springs in a single print job. In addition, this technique enables actuating, directly on the build-plate, printed functional objects that need a power source and actuators, such as a coffee mill.

Building a drone-based Haptic Device

Immersion is a key aspect in creating realistic virtual worlds. After significant advances in the visual domain, it appears that realistic haptic effects are the next step towards increasing immersion. A force-reflecting haptic interface generates synthetic force using mechanical actuators and delivers it to a user through physical contact or coupling with the user’s body. In order to convey the generated force to a certain body part, a haptic interface has to be affixed elsewhere. Generally, haptic devices are tethered to a surface. In this case the workspace is fixed on the ground and restricted in size by the mechanical limits of the interface. On the other hand, if the interface is tethered to the user’s body and exploits a body part as a reaction support, only "relative force" among body parts can be generated. A drone-based haptic device ideally overcomes these usability issues. A drone can actively generate kinetic energy and can push and pull a user’s hand without the need to be tethered. If this physical interaction is done in a well-controlled manner, it can be used as a free-moving force-reflecting haptic interface. As a proof-of-concept study, this research focuses on creating haptic feedback in only one dimension. To this end, an encountered-type, safe and untethered haptic display is implemented. An overview of the system and details on how to control the force output of drones are provided. Our current prototype generates forces up to 1.53 N upwards and 2.97 N downwards. This concept serves as a first step towards introducing drones as mainstream haptic devices.

17.07.2019 - Mazhar Hameed

Data Preparation in a Nutshell

This talk is designed to provide insight into my ongoing research in the field of data preparation. The talk will begin with some background details and some insights into my research at NUST KDRC related to Adaptive Systems for Business Intelligence. Furthermore, I will focus on my initial research plans in the area of data preparation in the Information Systems group.

Things to look forward to:

The talk will highlight the usability of some exciting tools that offer data preparation techniques, such as SAP Agile data preparation, Trifacta, and Talend.
Discussion of various data preparation features, such as deduplication, field format normalization, locating and fixing bad-quality data, content matching using regular expressions, and smart cleaning, along with a vision of how different preparators can resolve unattended use cases.
Initial background on the need for more research-intensive tools and how we plan to add value to the current research and development in data preparation.