Hasso-Plattner-Institut
Prof. Dr. Tobias Friedrich

Research Seminar (Summer Term 2022)


COVID-19 and virtual seminars

Due to the current situation, we are no longer hosting seminars at the HPI. Instead, we host virtual seminars via Zoom. If you want to join the seminar on short notice or if you do not want to subscribe to the mailing list, please send an email to Katrin.Casel(at)hpi.de or to Andreas.Goebel(at)hpi.de.

Description

A series of talks on current research and interesting topics in algorithm engineering and theoretical computer science. Everyone is welcome to attend the talks.

If you want to give a talk about a topic in algorithm engineering or theoretical computer science, please write an e-mail to Dr. Katrin Casel or to Dr. Andreas Göbel.

Announcements

For announcements of talks, subscribe to our mailing list.

Talks

Date | Room | Speaker | Title
01.04. 11:00 | virtual | Janosch Ruff

This talk is about a constructive proof of the Lovász local lemma, a beautiful result in its own right. Robin A. Moser presented a remarkably simple algorithm that constructs a combinatorial object with properties known to exist thanks to the probabilistic method. The main focus of the talk is an elegant proof technique that tries to compress the incompressible: a Kolmogorov complexity argument that yields an efficient algorithm as a byproduct. This one is definitely from "The Book". You don't have to believe in God, but you should believe in "The Book".
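
The resampling idea behind the constructive proof is short enough to sketch in code. The following is a minimal, illustrative Python implementation for k-SAT (the usual showcase of the Lovász local lemma), not the exact algorithm or analysis from the talk: pick any violated clause and resample its variables uniformly at random until no violated clause remains.

```python
import random

def moser_style_ksat(clauses, n_vars, max_steps=10**6):
    """Resampling sketch in the spirit of Moser's constructive proof.

    clauses: list of clauses; each clause is a list of non-zero ints in
    DIMACS style (literal v > 0 means variable v is True, v < 0 means False).
    Returns a satisfying assignment as a dict, or None if max_steps is hit.
    """
    assignment = {v: random.random() < 0.5 for v in range(1, n_vars + 1)}

    def satisfied(clause):
        return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

    for _ in range(max_steps):
        violated = [c for c in clauses if not satisfied(c)]
        if not violated:
            return assignment
        # Pick a violated clause and resample all of its variables uniformly.
        for lit in random.choice(violated):
            assignment[abs(lit)] = random.random() < 0.5
    return None

# Tiny example: (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
print(moser_style_ksat([[1, 2], [-1, 3], [-2, -3]], n_vars=3))
```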

08.04. 11:00 | virtual | Bachelor-Project TomTom 22/21

In this talk we will present theoretical and practical results of our Bachelor Project in the field of (electric vehicle) routing. We split our talk into two parts: In the first part we will talk about the routing speedup technique Arc-Flags. In particular, we will present our approaches and ideas on explaining this algorithm's efficiency using the graph parameter Skeleton Dimension. In the second part we will then visually compare electric vehicle routing to conventional routing and highlight the differences regarding travel time, costs, and detours based on real-world data.
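
As an illustration of how Arc-Flags speed up queries, here is a minimal, hypothetical Dijkstra variant that assumes the flags have already been precomputed for some partition of the vertices into cells; it is only a sketch of the pruning rule, not the project's implementation.

```python
import heapq

def arcflag_dijkstra(graph, flags, cell_of, source, target):
    """Dijkstra query that prunes edges via precomputed arc flags.

    graph:   dict u -> list of (v, weight) pairs
    flags:   dict (u, v) -> set of cells for which the edge (u, v) may lie
             on a shortest path into that cell (assumed precomputed)
    cell_of: dict v -> cell id of some fixed partition of the vertices
    """
    target_cell = cell_of[target]
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            # The pruning rule: skip edges not flagged for the target's cell.
            if target_cell not in flags[(u, v)]:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")
```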

22.04. 11:00 | virtual | Bachelor-Project Valyria 22/21

In our bachelor's project, we develop machine learning models to predict the price of real estate objects. In particular, we aim to employ transfer learning to improve the prediction quality on domains where only little training data is available. In our presentation, we will explore four concepts that we have been working with over the past semester. We will cover some simple transfer learning ideas for Deep Neural Networks (DNNs) and, in contrast, two techniques to improve DNN performance in a non-transfer context. Following that, we will present two other machine learning algorithms, k-nearest-neighbour and clustering, and how we adapt them to a transfer learning setting.
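
To give a flavour of what a transfer-learning adaptation of k-nearest-neighbour can look like, the hypothetical sketch below pools source- and target-domain training data and simply down-weights source-domain neighbours; the pooling scheme and the fixed weight are illustrative assumptions, not the models from the project.

```python
import numpy as np

def knn_transfer_predict(x, source_X, source_y, target_X, target_y,
                         k=5, source_weight=0.5):
    """Predict a value for feature vector x by k-nearest-neighbour regression
    over pooled source- and target-domain cases, down-weighting source cases."""
    X = np.vstack([target_X, source_X])
    y = np.concatenate([target_y, source_y])
    w = np.concatenate([np.ones(len(target_y)),
                        np.full(len(source_y), source_weight)])
    distances = np.linalg.norm(X - x, axis=1)
    nearest = np.argsort(distances)[:k]
    return np.average(y[nearest], weights=w[nearest])
```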

16.05. 15:00 | Belvedere (hybrid) | Aneta Neumann,
Frank Neumann

In many real-world scenarios it is necessary to handle uncertainty, and potentially disruptive events that violate constraints in stochastic settings need to be avoided. Many evolutionary multi-objective algorithms have recently been analyzed and applied to submodular problems with different types of constraints. This talk will provide insights into the impact of uncertainty in advanced mine optimisation, a real-world stochastic optimisation problem, and into approaches that allow one to mitigate the uncertainty in the deposit while maintaining high profitability. We will also show the behavior of evolutionary multi-objective algorithms on different submodular chance-constrained network problems.

Joint work with Benjamin Doerr, Carola Doerr, Frank Neumann, Simon Ratcliffe, Will Reid, Michael Stimson, Andrew M. Sutton.


Diversity plays a crucial role in the area of evolutionary computation. Traditionally, diversity has been used to enable successful crossover operations and prevent premature convergence. In recent years, computing sets of solutions that have structural or behavioural diversity has gained significant attention under the terms evolutionary diversity optimization and quality diversity. This talk will give an overview of this area of research and point out some recent results. We will cover results in the area of evolutionary diversity optimization for the classical traveling salesperson problem and show how quality diversity approaches can be used to achieve better solutions for the traveling thief problem.

Joint work with Viet Anh Do, Mingyu Guo, Aneta Neumann, Adel Nikfarjam
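
One common way to make "structural diversity" concrete for the TSP is to measure how many edges two tours share. The sketch below computes a simple population-level diversity score of this kind; it is an illustrative measure, not necessarily the one used in the results covered by the talk.

```python
def tour_edges(tour):
    """Undirected edge set of a TSP tour given as a permutation of cities."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}

def edge_diversity(tours):
    """Average normalized pairwise edge-set difference of a set of tours.

    0 means all tours use exactly the same edges; values close to 1 mean
    the tours are structurally very different.
    """
    sets = [tour_edges(t) for t in tours]
    n_tours, n_cities = len(sets), len(tours[0])
    total, pairs = 0.0, 0
    for i in range(n_tours):
        for j in range(i + 1, n_tours):
            total += len(sets[i] ^ sets[j]) / (2 * n_cities)
            pairs += 1
    return total / pairs if pairs else 0.0

# Example: two rotations of the same tour share all edges; the third differs.
print(edge_diversity([[0, 1, 2, 3], [1, 2, 3, 0], [0, 2, 1, 3]]))
```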

20.05. 11:00 | virtual | Elazar Goldberg

In many applications one wants to measure the distance between a given pair of strings. The simplest measure, the Hamming distance, is obtained by scanning the strings and counting the number of positions in which they differ. However, in many cases this is not very useful. Instead, a very common measure is the edit distance, which counts the minimal number of edit operations needed to convert the first string into the second, where the set of edit operations consists of character substitution, insertion, and deletion. While the edit distance is more applicable, it seems computationally more challenging. A possible method to bridge the two measures is an embedding: a mapping that preserves distances. In this talk I'll show an embedding protocol. The correctness analysis follows from an argument concerning a drunken walk. No prior knowledge is needed. The talk is based on joint work with Michal Koucky and Diptarka Chakraborty.
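
As a concrete reference point for the two measures, here is a short illustration: the Hamming distance compares positions directly, while the edit distance is computed by the classical dynamic program. The cyclic-shift example at the end shows why the edit distance is often the more useful notion.

```python
def hamming_distance(a, b):
    """Number of positions in which two equal-length strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

def edit_distance(a, b):
    """Minimal number of substitutions, insertions and deletions turning
    a into b (classical dynamic program)."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = min(dp[i - 1][j] + 1,                        # deletion
                           dp[i][j - 1] + 1,                        # insertion
                           dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return dp[m][n]

# Shifting a string by one character makes the Hamming distance large
# while the edit distance stays small.
print(hamming_distance("abcde", "bcdea"))  # 5
print(edit_distance("abcde", "bcdea"))     # 2
```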

27.05. 11:00 | virtual | Stefan Neubert

When we talk about complexity in TCS, we usually refer to the time complexity of algorithms and problems in a rather simple model: we assume a single processing unit that has access to a lot of fast memory. In practice, this is mostly not the case: huge amounts of data are spread out over distributed systems (or at least separated components of a single system) and must be combined to compute some function on that data. For such systems – but also to analyze streaming algorithms, data structures, VLSI circuits, and more – it is interesting to analyze the amount of communication that has to take place to compute that function. In winter semester 2021 I gave a lecture on the basics of Communication Complexity. This talk gives an overview of the model and some interesting results from the field.

03.06. 11:00 | virtual | Ben Bals

Human lives are increasingly influenced by algorithms, which therefore need to meet higher standards not only in accuracy but also with respect to explainability. This is especially true for high-stakes areas such as real estate valuation. Unfortunately, the methods applied there often exhibit a trade-off between accuracy and explainability. One explainable approach is case-based reasoning (CBR), where each decision is supported by specific previous cases. However, such methods can be wanting in accuracy. The unexplainable machine learning approaches are often observed to provide higher accuracy but are not scrutable in their decision-making.

In this paper, we apply evolutionary algorithms (EAs) to CBR predictors in order to improve their performance. In particular, we deploy EAs to the similarity functions (used in CBR to find comparable cases), which are fitted to the data set at hand. As a consequence, we achieve higher accuracy than state-of-the-art deep neural networks (DNNs), while keeping interpretability and explainability.

These results stem from our empirical evaluation on a large data set of real estate offers where we compare known similarity functions, their EA-improved counterparts, and DNNs. Surprisingly, DNNs are only on par with standard CBR techniques. However, using EA-learned similarity functions does yield an improved performance.
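
To make the general idea tangible, the sketch below shows a hypothetical (1+1)-EA that tunes the feature weights of a simple weighted similarity function used by a kNN-style CBR predictor; both the similarity function and the mutation scheme are illustrative assumptions, not the operators from the paper.

```python
import random

def weighted_similarity(x, y, weights):
    """Simple weighted similarity between two feature vectors (illustrative)."""
    return -sum(w * abs(a - b) for w, a, b in zip(weights, x, y))

def cbr_predict(query, cases, weights, k=3):
    """Predict by averaging the targets of the k most similar stored cases."""
    ranked = sorted(cases,
                    key=lambda c: weighted_similarity(query, c[0], weights),
                    reverse=True)
    return sum(target for _, target in ranked[:k]) / k

def evolve_weights(cases, validation, dim, steps=1000):
    """(1+1)-EA sketch: mutate the weight vector, keep it if the validation
    error does not get worse."""
    weights = [1.0] * dim

    def error(w):
        return sum(abs(cbr_predict(x, cases, w) - t) for x, t in validation)

    best = error(weights)
    for _ in range(steps):
        child = [max(0.0, w + random.gauss(0, 0.1)) for w in weights]
        e = error(child)
        if e <= best:
            weights, best = child, e
    return weights
```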

10.06. 11:00 | virtual | Leon Schiller,
Simon Wietheger
Graph embeddings are a popular tool in Network Science and Machine Learning that helps to understand the structure of complex networks. For this reason, embeddings have various practical applications such as link prediction, community detection, or visualization. We focus on embedding techniques that are in accordance with established generative network models, as these models reliably explain the most important properties of real-world networks. One such model, which is likewise used for embeddings, is the Hyperbolic Random Graph (HRG) model. However, many existing embedders for HRGs compute embeddings only in two-dimensional hyperbolic space, in contrast to many alternative embedding techniques from Machine Learning, where very high-dimensional spaces are used in practice. A recent generalization of HRGs are Geometric Inhomogeneous Random Graphs (GIRGs), which allow for adjusting the dimensionality of the ground space while being technically simpler than HRGs but similarly realistic.

We investigate the suitability of this model for embedding networks by generalizing embedding techniques for HRGs to this model, paying special attention to the influence of the dimensionality. Our goal is to answer the question whether high-dimensional spaces are actually necessary or whether a similar performance can be achieved in a low-dimensional space. Besides conducting such practical experiments, we analyze the geometric properties of the GIRG ground space and study the properties of GIRGs as a function of their dimensionality, which is often assumed to be constant in the literature. Our preliminary results show that the GIRG model has geometric properties very similar to those of hyperbolic space. This suggests that it is capable of embedding hierarchical structures already in low-dimensional spaces. The goal of the thesis is to find out which GIRG dimensionality can be considered realistic for real-world networks and to make theoretical contributions regarding the properties of GIRGs based on their dimensionality.
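
For intuition, here is a naive O(n²) sampler for a threshold GIRG-like model: vertices get uniform positions on the d-dimensional torus and are connected whenever their distance is small relative to the product of their weights. The constants and the exact form of the connection rule are simplified assumptions, not the precise model from the literature or the thesis.

```python
import random

def torus_distance(x, y):
    """L-infinity distance on the d-dimensional unit torus."""
    return max(min(abs(a - b), 1 - abs(a - b)) for a, b in zip(x, y))

def sample_threshold_girg(weights, d, c=1.0):
    """Naive sampler for a threshold GIRG-like model (illustrative only).

    u and v are connected if dist(x_u, x_v)^d <= c * w_u * w_v / W,
    where W is the total weight.
    """
    n = len(weights)
    W = sum(weights)
    pos = [[random.random() for _ in range(d)] for _ in range(n)]
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            if torus_distance(pos[u], pos[v]) ** d <= c * weights[u] * weights[v] / W:
                edges.append((u, v))
    return pos, edges
```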

Unweighted Correlation Clustering is a clustering problem defined on complete graphs, where every edge is labeled either positively or negatively. The goal is to partition the vertices such that the number of negative edges inside each cluster plus the number of positive edges between different clusters is minimized. While Correlation Clustering is well established, we focus on the rather young area of Fair Correlation Clustering. There, each vertex of the graph is labeled with some color. A partition is called fair if in each of its sets the color ratio is the same as in the whole graph. Fair Correlation Clustering asks for a fair partition that minimizes the correlation clustering cost. As Correlation Clustering itself is NP-hard and the fairness constraints pose additional challenges, we start our examination with instances where the positively labeled edges form a tree, which already turns out to be very challenging.
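
The two ingredients of the problem are easy to state in code: the correlation clustering cost of a partition and the exact-fairness condition. The following sketch computes both on a complete signed graph given by its set of positive edges; it is a straightforward illustration, not the algorithms studied in the talk.

```python
from collections import Counter

def correlation_cost(positive_edges, partition):
    """Correlation clustering cost on a complete signed graph.

    positive_edges: set of frozensets {u, v} labeled '+'; every other pair
    is implicitly labeled '-'.  partition: list of vertex sets.
    """
    cluster_of = {v: i for i, cluster in enumerate(partition) for v in cluster}
    vertices = list(cluster_of)
    cost = 0
    for i, u in enumerate(vertices):
        for v in vertices[i + 1:]:
            same = cluster_of[u] == cluster_of[v]
            positive = frozenset((u, v)) in positive_edges
            # Disagreement: '-' edge inside a cluster or '+' edge across clusters.
            cost += (same and not positive) or (not same and positive)
    return cost

def is_fair(partition, color):
    """Every cluster must reproduce the global color ratio exactly."""
    total = Counter(color[v] for cluster in partition for v in cluster)
    n = sum(total.values())
    for cluster in partition:
        local = Counter(color[v] for v in cluster)
        if any(local[c] * n != total[c] * len(cluster) for c in total):
            return False
    return True
```
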
24.06. 11:00 | virtual | Vanja Doskoč,
Julian Berger

We study classes of formal languages learned by computable learners subject to certain restrictions. In particular, we compare, for each considered restriction, the sets of learnable classes with each other. When requiring the learners to eventually provide a single, correct explanation of the target language (explanatory learning), the literature already provides plenty of completed comparisons, depicted in lucid maps. However, when expecting the learners to provide a semantically correct, but possibly syntactically changing description of the target language (behaviourally correct learning), only partial results are known. We complete these results and provide full behaviourally correct maps for different types of data presentation. This is joint work with Timo Kötzing.


Network Creation Games are a popular tool to study the emergence of networks created by selfish agents, such as the Internet. As navigation is an important function of many networks and information about the global network structure is not always available locally, we study Network Creation Games that enforce greedy routing. We examine different variants of that model and analyze the existence of equilibria, computational hardness, as well as other game theoretic properties.
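
For concreteness, a standard geometric reading of greedy routing is: at every step, forward to the neighbour that is closest to the target, and fail if no neighbour makes progress. The sketch below implements this check on a graph with node coordinates; the precise routing notion in the model of the talk may differ.

```python
import math

def greedy_route(graph, pos, source, target):
    """Greedy routing: repeatedly forward to the neighbour geometrically
    closest to the target; return None when no neighbour makes progress.

    graph: dict v -> set of neighbours, pos: dict v -> (x, y) coordinates.
    """
    def dist(u, v):
        return math.dist(pos[u], pos[v])

    path = [source]
    current = source
    while current != target:
        best = min(graph[current], key=lambda w: dist(w, target), default=None)
        if best is None or dist(best, target) >= dist(current, target):
            return None  # greedy routing got stuck
        current = best
        path.append(current)
    return path
```
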
29.06. 16:00 | virtual | Hans Gawendowicz,
Martin Schirneck

During a pandemic, people have to find a trade-off between meeting others and staying safely at home. While meeting others is pleasant, it also increases the risk of infection. We consider this dilemma by introducing a game-theoretic network creation model in which selfish agents can form bilateral connections. They benefit from network neighbors, but at the same time, they want to maximize their distance to all other agents. This models the inherent conflict that social distancing rules impose on the behavior of selfish agents in a social network. Besides addressing this familiar issue, our model can be seen as the inverse to the well-studied Network Creation Game by Fabrikant et al. [PODC 2003], where agents aim at being as central as possible in the created network. Thus, our work is in line with studies that compare minimization problems with their maximization versions.

We look at two variants of network creation governed by social distancing. In the first variant, there are no restrictions on the connections being formed. We characterize optimal and equilibrium networks, and we derive asymptotically tight bounds on the Price of Anarchy and Price of Stability. The second variant is a generalization of the model that allows restrictions on the connections that can be formed. As our main result, we prove that Swap-Maximal Routing-Cost Spanning Trees, an efficiently computable weaker variant of Maximum Routing-Cost Spanning Trees, actually resemble equilibria for a significant range of the parameter space. Moreover, we give almost tight bounds on the Price of Anarchy and Price of Stability. These results imply that, compared to the well-studied inverse models, under social distancing the agents' selfish behavior has a significantly stronger impact on the quality of the equilibria, i.e., it allows socially much worse stable states. (Joint work with Tobias Friedrich, Pascal Lenzner, and Anna Melnichenko)
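
To illustrate the objects in the second variant, the sketch below computes the routing cost (here simplified to the sum of pairwise hop distances) of a spanning tree and checks swap-maximality by brute force, i.e., that no single edge swap that keeps a spanning tree increases the routing cost. This is only an illustrative baseline under simplifying assumptions, not the efficient computation referred to in the abstract.

```python
from collections import deque

def routing_cost(n, tree_edges):
    """Sum of pairwise hop distances in a spanning tree on vertices 0..n-1."""
    adj = {v: [] for v in range(n)}
    for u, v in tree_edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for s in range(n):
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
    return total // 2  # every pair was counted twice

def is_spanning_tree(n, edges):
    """A graph on n vertices with n-1 edges is a spanning tree iff connected."""
    if len(edges) != n - 1:
        return False
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == n

def is_swap_maximal(n, host_edges, tree_edges):
    """Brute-force check that no single edge swap that keeps a spanning tree
    increases the routing cost (illustrative, not an efficient algorithm)."""
    base = routing_cost(n, tree_edges)
    tree = set(map(frozenset, tree_edges))
    for e_new in map(frozenset, host_edges):
        if e_new in tree:
            continue
        for e_old in tree:
            candidate = [tuple(e) for e in (tree - {e_old}) | {e_new}]
            if is_spanning_tree(n, candidate) and routing_cost(n, candidate) > base:
                return False
    return True
```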


An edge fault-tolerant diameter oracle (FDO) is a data structure that preprocesses a graph \(G\) and, when queried with an edge \(e\), returns the (approximate) diameter of the graph \(G-e\) in which that edge failed. The problem of designing FDOs was originally raised by Henzinger et al. [ITCS 2017] and recently received some renewed interest by Bilò et al. [MFCS 2021]. We continue this study with a special focus on space efficiency. Bilò et al. showed that for every positive integer \(m\) there is a graph \(G\) with \(m\) edges such that every FDO for \(G\) with a stretch better than \(3/2\) requires \(\Omega(m)\) bits of space. The original bound was proven with a low-diameter graph, and we show that this assumption is essential. For graphs with a large diameter of \(\omega(n^{5/6})\), we give a \((1+o(1))\)-approximate FDO taking \(\widetilde{O}(n)\) space, breaking the \(\Omega(m)\) barrier. This raises the question of the smallest diameter that allows for \(o(m)\)-space oracles. We narrow the gap by first extending the \(\Omega(m)\) bound up to diameter \(D = O(n/\sqrt{m})\) and stretch better than \(3/2 - 1/D\). Conversely, we show that for diameter \(\omega(n^{4/3} \log n/(\varepsilon \sqrt{m}))\), \((1+\varepsilon)\)-approximate FDOs with \(o(m)\) space are possible.

Initially, our oracle constructions are randomized, relying on a sampling technique that is frequently used in fault tolerance. We develop a new framework for efficient derandomization and show its versatility by applying it not only to our own data structures but also to results from the literature. In particular, we derandomize the distance sensitivity oracle (DSO) by Ren [JCSS 2022] to obtain the first deterministic DSO with subcubic preprocessing time. We also derandomize the Single-Source Shortest Paths algorithm by Chechik and Magen [ICALP 2020]. This is joint work with Davide Bilò, Keerti Choudhary, Sarel Cohen, and Tobias Friedrich.
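
As a baseline for what an FDO does, the following sketch simply precomputes and stores the exact diameter of \(G-e\) for every edge \(e\): it answers queries in constant time but uses one table entry per edge, which is exactly the kind of \(\Omega(m)\)-style space usage the oracles in the talk aim to beat. It illustrates the interface only, not the construction from the paper.

```python
from collections import deque

def diameter(adj):
    """Largest hop distance in the graph; infinity if it is disconnected."""
    best = 0
    for s in adj:
        dist, queue = {s: 0}, deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < len(adj):
            return float("inf")
        best = max(best, max(dist.values()))
    return best

def build_naive_fdo(vertices, edges):
    """Store diam(G - e) for every edge e: exact answers in O(1) time,
    but one table entry per edge (the trivial baseline)."""
    table = {}
    for e in edges:
        adj = {v: set() for v in vertices}
        for u, w in edges:
            if {u, w} != set(e):
                adj[u].add(w)
                adj[w].add(u)
        table[frozenset(e)] = diameter(adj)
    return table

def query_fdo(table, e):
    """Return the diameter of G - e for a failed edge e."""
    return table[frozenset(e)]
```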