Hasso-Plattner-Institut25 Jahre HPI
Hasso-Plattner-Institut25 Jahre HPI
  • de

19.10.2016 - no meeting

26.10.2016 - Francesco Quinzan

EAs in Real-World Optimization: Noise, Performance, Applications

When dealing with optimization of real-world numerical models, it is often unfeasible to perform correlation via direct methods. This is the case of the thermal analysis of bespoke electro-mechanical systems for extreme harsh environments, such as actuators and wireless sensors. Evolutionary Algorithms (EAs) are generic population-based optimization algorithms, useful to approach such problems.

Even though in the last decade theoretical results have been presented regarding the running time of EAs, there is still little understanding of the interplay between the operators of randomized search heuristics and explicit noise-handling techniques, such as statistical resampling. We report on several statistical models and theoretical results that help to clarify this reciprocal relationship for a collection of randomized search heuristics on noisy functions.

We consider the optimization of pseudo-Boolean functions under additive posterior Gaussian noise and explore the trade-off between noise reduction and the computational cost of resampling. We perform experiments to find the optimal parameters at a given noise intensity for a mutation-only evolutionary algorithm, a genetic algorithm employing recombination, an estimation of distribution algorithm (EDA), and an ant colony optimization algorithm. We observe how the optimal parameter depends on the noise intensity for the different algorithms. We locate the point where statistical resampling costs more than it is worth in terms of run time.

We find that the EA requires the highest number of resamples to obtain the best speed-up, whereas crossover reduces both the run time and the number of resamples required. We find that EDA-like algorithms require no resampling, and can handle noise implicitly. Interestingly, we observe that all tested algorithms exhibit high variance for increasing noise. We plan to combine the latter observation with rigorous analysis to define the optimal re-start strategy, with the hope of further improving performance.

02.11.2016 - Nanjing Feedback Round

09.11.2016 - Anton Krohmer

Structures & Algorithms in Hyperbolic Random Graphs

Complex networks like social networks, biological networks or computer networks are ubiquitous in nature and society. In this talk, we introduce a generative mathematical model to study such networks, called „Hyperbolic Random Graphs“. We theoretically analyze their structural properties, for instance the emergence of cliques and its diameter. Furthermore, we present an algorithm for embedding real-world networks into the hyperbolic plane such that connected nodes are nearby, and disconnected nodes are placed far apart. We perform an extensive experimental evaluation to ensure its quality, and also apply it to a real-world graph as a proof of concept.

16.11.2016 - Kateryna Kuksenok

When Good Intentions Meet Reality: Qualitative Study of Programming Practices

Programming is undertaken by people many backgrounds for many purposes. This talk focuses on qualitative methods for the study of changing programming practices. Based on ethnographic observation and interviews, we developed a theory of deliberate individual change. Under this theory, a shared imagination of a perfect world drives prioritization and decision-making. It is an unattainable moving target against which current action is evaluated. The tension between the imagined and actual reality is explored both through an ethnography of oceanographers, and more recently in mining code repositories and revision histories. 

23.11.2016 - Sankalita Mandal

Re-evaluation of Decisions based on Events

The effective and efficient design and execution of business processes is a key driver for organizations. Therefore, organization capture their business processes in form of process models, with the industry-standard BPMN (Business Process Model and Notation) serving as guidance for the implementation of information systems. Recently, the importance of decision in business processes and structurally capturing them has been revived by the new Decision Model and Notation (DMN) standard. Decisions have a direct influence on the quality and performance of business processes, because they determine the course of business processes. Usually, when a decision is taken, a specific path is chosen immediately based on that or the output is used by the process later on. However, in the digital world, organizations have the possibility to access an abundance of data in real-time. This offers the possibility to re-evaluate decision. . In the current work, a re-evaluation technique for decisions is presented by using event processing methods. If relevant event arrives within a certain time frame, then it may change previously taken decision to improve the process outcome.

30.11.2016 - Erik Scharwächter

Contextual Change Detection in Large Sensor Collections

The monitoring of urban, societal and environmental variables like air pollution, mobile phone usage, or tropical vegetation cover over time through various sensors allows to detect interesting, unusual changes relevant for decision makers. Recently, novel approaches for contextual time series change (CTC) detection have been proposed that define changes within a time series with respect to the behavior of other, similar time series. We address two shortcomings of the existing approaches using remote sensing imagery as an example use case: (1) sensors deliver high-dimensional time-series with context and changes hidden in subspaces, but existing work is limited to the univariate case; and (2) context construction is the major bottleneck of existing approaches and does not scale for large numbers of time series.

07.12.2016 - no meeting

14.12.2016 - Heng Gao

21.12.2016 - Christmas Break

28.12.2016 - Christmas Break

04.01.2017 - Martin Krejca

Escaping Local Optima with Diversity Mechanisms and Crossover.

Population diversity is essential for the effective use of any crossover operator. We compare seven commonly used diversity mechanisms and prove rigorous run time bounds for the (µ+1) GA using uniform crossover on the fitness function Jump_k. All previous results in this context only hold for unrealistically low crossover probability p_c = O(k/n), while we give analyzes for the setting of constant p_c < 1 in all but one case. Our bounds show a dependence on the problem size n, the jump length k, the population size µ, and the crossover probability p_c. This proves a sizable advantage of all variants of the (µ+1) GA compared to the (1+1) EA.

11.01.2017 - Ralf Rothenberger

The structure of industrial SAT instances

Propositional satisfiability (SAT) is one of the most fundamental problems in computer science. The worst-case hardness of SAT lies at the core of computational complexity theory. The average case of SAT has triggered the development of sophisticated rigorous and non-rigorous techniques for analyzing random structures.

Despite a long line of research and substantial progress, nearly all theoretical work on random SAT assumes a uniform distribution on the variables. In contrast, real-world instances often exhibit large fluctuations in variable occurrence. This can be modeled by a scale-free distribution of the variables, which results in distributions closer to industrial SAT instances.

We study random k-SAT on n variables, m = Θ(n) clauses, and a power-law distribution on the variable occurrences with exponent β. We observe a satisfiability threshold at β = (2k −1)/(k −1). This threshold is tight in the sense that instances with β ≤ (2k −1)/(k −1)−ε for any constant ε > 0 are unsatisfiable with high probability (w. h. p.). For β ≥ (2k−1)/(k−1)+ε the picture is reminiscent of the uniform case: instances are satisfiable w. h. p.  for sufficiently small constant clause-variable ratios m/n; they are unsatisfiable above a ratio m/n that depends on β.

18.01.2017 - Vladeta Stojanovic

Interactive Visualisation of Sensor Data for Building Lifecycle Management using BIM

The ability to capture and visualize BIM data is becoming increasingly important. This is especially the case when new or additional BIM documentation is required for existing buildings. Building managers can make use of as-is BIM data in order to create and enhance planning and operational documentation for a building throughout its lifetime. As-is BIM data can be created using point-cloud based data capturing and processing techniques. Methods for converting point cloud data to usable BIM geometry data and associated semantics pose a particular challenge. In addition to using as-is BIM data, the use of sensors in the built environment allows for efficient lifecycle management of a given building. Sensor data can be captured and visualized in real-time and provide additional insight into the operation of a building. The visualization of sensor data can be used to enhance the visual output of the associated BIM data in order to show not just the physical features of the building, but also the status of the internal operation of the building. The presentation of this combined BIM and sensor data allows for the generalized presentation of building operations within the “Real-Estate 4.0” realm. The use of service-based interactive visualization could be used to extend the representation of visualized data to less powerful mobile devices. This could be potentially beneficial for increasing stakeholder engagement amongst facility managers, by allowing complex visualization results to be transmitted to clients in real-time on portable mobile and computer devices. This research project will investigate and develop new methods and techniques for interactive visualization of “as-is” BIM and sensor data related to building lifecycle management. The outcomes of this project will contribute original knowledge to the field of BIM, sensor data visualization and point-cloud processing methods.

25.01.2017 - Jossekin Beilharz

Coordination Languages – From NUMA-Nodes to Cloud Federations

In contrast to applications in the domain of high performance computing and compute clusters, today's hierarchical, heterogeneous and dynamic distributed systems require a fresh assessment of the existing coordination systems.

This talk discusses the new challenges and identifies assumptions of existing distribution frameworks that don't apply to today's systems. Existing coordination frameworks are classified in three categories and their suitability for hierarchical, heterogeneous and dynamic distributed systems is assessed. Frameworks should be able to work on all levels of the hardware topology: From the smallest execution units – processor cores – to the highest level – cloud federations.

Based on identified best practices "Claud" is presented:

a compact and adaptable framework that provides coordination languages in the context of cloud computing. As locality of data and tasks is the decisive performance factor in cloud environments, Claud offers divers distribution modules to express the relationship between tasks and data on an abstract level. Instead of specifying the distribution explicitly, the distribution is handled by the framework implicitly. The basis of this implicit distribution decision can, however, be augmented and enhanced by the provisioning of additional information about the algorithm and its data.

The approach of Claud and the utilization of its distribution modules is evaluated using the Barnes Hut algorithm which demonstrates it's capacity for different distribution demands and the usage of varied additional information from the programmer.

01.02.2017 - Oliver Schneider

Software Tools for Vibrotactile Grids

Vibrotactile actuators, which stimulate the skin through vibration, are ubiquitous in mobile devices and smartwatches, and increasingly deployed in commercial products for entertainment, accessibility, and navigation. These more complex devices involve not just a single actuator, but several arranged into grids in chairs, vests, and belts. With careful sequencing, grids can render the illusion of complex motion paths, for example, sending a shiver down someone’s spine in a horror game, or a “turn left” signal for the driver of a car. To better support these devices, I have built a series of software tools that allow engineers and artists to easily create and customize vibrotactile effects, including motion, in real-time.

08.02.2017 - Julian Risch

Entropy-Based Topic Modeling on Multiple Domain-Specific Text Collections

Cross-collection topic modeling is substantial for the exploration, clustering, and comparison of large sets of text documents from different collections. Its field of application, comparative text mining, reaches from genre analysis and bias detection to the revelation of cultural and geographic differences or the search for prior art in patents and scientific papers. However, topic modeling on multiple collections is challenging because of domain-specific vocabulary.

This talk presents a novel topic model that combines cross-collection topic modeling with automatic domain term extraction. The model distinguishes domain-specific and domain-independent words based on information entropy and reveals commonalities and differences of multiple text collections.

We evaluate our model on patents, scientific papers, newspaper articles, forum posts, and Wikipedia pages. In comparison to state-of-the-art approaches, our model improves on topic coherence, perplexity, and classification accuracy. In addition, our model ensures disjunct general and specific word distributions, resulting in clear-cut topic representations.