Prof. Dr. Tobias Friedrich

# Research Seminar (Summer Term 2022)

## COVID-19 and virtual seminars

Due to the current situation we are not hosting seminars at the HPI anymore. Instead, we host virtual seminars via zoom. If you want to join the seminar on short notice or if you do not want to subscribe to the mailing list, please send an email to Katrin.Casel(at)hpi.de or to Andreas.Goebel(at)hpi.de.

## Description

A series of talks on current research and interesting topics in algorithm engineering and theoretical computer science. Everyone is welcome to attend talks.

If you want to give a talk about a topic in algorithm engineering or theoretical computer science, please write an e-mail to Dr. Katrin Casel or to Dr. Andreas Göbel.

## Announcements

For announcements of talks, subscribe to our mailing list.

## Talks (click on title to show abstract)

DateRoom SpeakerTitle
01.04. 11:00virtual Janosch Ruff

The talk is about a constructive proof of the Lovász local lemma. A beautiful result on its own, Robin A. Moser presented a remarkable simple algorithm to construct a combinatorial object with properties known to exist thanks to the probabilistic method. The main focus of the talk is an elegant proof technique that tries to compress the incompressible by a Kolmogorov Complexity argument giving an efficient algorithm as a byproduct. This one is definitely from "The Book". You don't have to believe in God, but you should believe in "The Book".

08.04. 11:00virtual Bachelor-Project TomTom 22/21

In this talk we will present theoretical and practical results of our Bachelor Project in the field of (electric vehicle) routing. Therefore we split our talk in two parts: In the first part we will talk about the routing speedup technique Arc-Flags. In particular we will present our approaches and ideas on explaining this algorithms efficiency using the graph parameter Skeleton Dimension. In the second part we will then visually compare electric vehicle routing to conventional routing and highlight the differences regarding travel time, costs and detours based on real world data.

22.04. 11:00virtual Bachelor-Project Valyria 22/21

In our bachelors project, we come up with machine learning models to predict the price of real estate objects. In particular, we are aiming to employ transfer learning to improve the prediction quality on domains where only little training data is available. In our presentation, we will explore four concepts that we have been working with over the past semester. We will cover some simple transfer learning ideas for Deep Neural Networks and, in contrast, two techniques to improve DNN performance in a non-transfer context. Following that, we will present two other machine learning algorithms, k-nearest-neighbour and clustering, and how we adapt them to a transfer learning setting.

16.05. 15:00Belvedere (hybrid)Aneta Neumann
Frank Neumann

In many real-world scenarios, it is necessary to handle uncertainty, and potentially disruptive events that violate constraints in stochastic settings need to be avoided. A lot of evolutionary multi-objective algorithms have recently been analyzed and applied to submodular problems with different types of constraints. This talk will provide you insights on the impact of uncertainty in advanced mine optimisation on the real-word stochastic optimisation problem which allow one to mitigate the uncertainty in the deposit while maintaining high profitability. We will also show the behavior of evolutionary multi-objective algorithms on different submodular chance constrained network problems.

Joint work with Benjamin Doerr, Carola Doerr, Frank Neumann, Simon Ratcliffe, Will Reid, Michael Stimson, Andrew M. Sutton.

Diversity plays a crucial role in the area of evolutionary computation. Traditionally, diversity has been used to enable successful crossover operations and prevent premature convergence. In recent years, computing sets of solutions that have structural or behavioural diversity has gained significant attention under the terms evolutionary diversity optimization and quality diversity. This talk will give an overview on this area of research and point out some recent results. We will cover results in the area of evolutionary diversity optimization for the classical traveling salesperson problem and show how quality diversity approaches can be used to achieve better solutions for the traveling thief problem.

Joint work with Viet Anh Do, Mingyu Guo, Aneta Neumann, Adel Nikfarjam

20.05. 11:00virtual Elazar Goldberg

In many applications one wants to measure distances between a given pair of strings. The simplest way, called Hamming distance, is scanning the strings and comparing the number of characters by which they differ. However, in many cases this is not very useful. Instead, a very common measurement is called edit distance, which counts the minimal number of edit operations needed for converting the first string into the second. Where the set of edit operations consists of character substitution, insertion and deletion. While the edit distance is more applicable, computationally, it seems to be more challenging. A possible method to bridge between the measurements is called embedding: This is a mapping that preserves distances. In this talk I'll show an embedding protocol. The correctness analysis follows by an argument concerning a drunk walk. No prior knowledge is needed. The talk is based on a joint work with Michal Koucky and Diptarka Chakraborty.

27.05. 11:00virtual Stefan Neubert

When we talk about complexity in TCS, we usually refer to the time complexity of algorithms and problems in a rather simple model. We assume a single processing unit that has access to a lot of fast memory. In practice, this is mostly not the case: huge amounts of data are spread out over distributed systems (or at least separated components of a single system) and must be combined to compute some function on that data. For such systems – but also to analyze streaming algorithms, data structures, VLSI circuits and more – it is interesting to analyze the amount of communication that has to take place to compute that function. In winter semester 2021 I gave a lecture on the basics of Communication Complexity. This talk gives an overview on the model and some interesting results from the field.

03.06. 11:00virtual Ben Bals

Human lives are increasingly influenced by algorithms, which therefore need to meet higher standards not only in accuracy but also with respect to explainability. This is especially true for high-stakes areas such as real estate valuation. Unfortunately, the methods applied there often exhibit a trade-off between accuracy and explainability. One explainable approach is case-based reasoning (CBR), where each decision is supported by specific previous cases. However, such methods can be wanting in accuracy. The unexplainable machine learning approaches are often observed to provide higher accuracy but are not scrutable in their decision-making.

In this paper, we apply evolutionary algorithms (EAs) to CBR predictors in order to improve their performance. In particular, we deploy EAs to the similarity functions (used in CBR to find comparable cases), which are fitted to the data set at hand. As a consequence, we achieve higher accuracy than state-of-the-art deep neural networks (DNNs), while keeping interpretability and explainability.

These results stem from our empirical evaluation on a large data set of real estate offers where we compare known similarity functions, their EA-improved counterparts, and DNNs. Surprisingly, DNNs are only on par with standard CBR techniques. However, using EA-learned similarity functions does yield an improved performance.

10.06. 11:00virtualLeon Schiller,
Simon Wietheger
Graph embeddings are a popular tool in Network Science and Machine Learning that helps understand the structure of complex networks. For this reason, embeddings have various practical applications such as link prediction, community detection, or visualization. We want to focus on embedding techniques that are in accordance with established generative network models as these models reliably explain the most important properties of real-world networks. One such model, that is likewise used for embeddings, are Hyperbolic Random Graphs (HRGs). However, many existing embedders for HRGs compute embeddings only in two-dimensional hyperbolic space which is in contrast to many alternative embedding techniques from Machine Learning where very high dimensional spaces are used in practice. A recent generalization of HRGs are Geometric Inhomogeneous Random Graphs (GIRGs) which allow for adjusting the dimensionality of the ground space while being technically simpler than HRGs but similarly realistic. We investigate the suitability of this model for embedding networks by generalizing embedding techniques of HRGs to this model, whereby we pay special attention to the influence of the dimensionality. Our goal is to answer the question if high-dimensional spaces are actually necessary or if a similar performance can be achieved in a low-dimensional space. Besides conducting such practical experiments, we analyze the geometric properties of the GIRG ground space and study the properties of GIRGs as a function of their dimensionality which is often assumed to be constant in the literature. Our preliminary results show that the GIRG model has geometric properties very similar to that of hyperbolic space. This suggests that it is capable of embedding hierarchical structures already in low-dimensional spaces. The goal of the thesis is to find out what GIRG-dimensionality can be considered realistic for real-world networks and to make theoretical contributions regarding the properties of GIRGs based on their dimensionality.

Unweighted Correlation Clustering is a clustering problem defined on complete graphs, where every edge is labeled either positively or negatively. The goal is to partition the vertices such that the number of negative edges inside each cluster plus the number of positive edges between different clusters is minimized. While Correlation Clustering is well established, we focus on the rather young area of Fair Correlation Clustering. There, each vertex of the graph is labeled with some color. A partition is called fair if in each of its sets the color ratio is the same as in the whole graph. Fair Correlation Clustering asks for a fair partition that minimizes the correlation clustering cost. With Correlation Clustering itself being NP-hard and the fairness constraints putting additional challenges, we start our examinations on instances where the positively labeled edges form a tree, which turns out to be already very challenging.
24.06. 11:00virtual Vanja Doskoč,
Julian Berger

We study classes of formal languages learned by computable learners imposed with certain restrictions. In particular, we compare, for each considered restriction, the set of learnable classes with each other. When requiring the learners to eventually provide a single, correct explanation of the target language (explanatory learning), the literature already provides plenty completed comparisons, depicted in lucid maps. However, when expecting the learners to provide a semantically correct, but possibly syntactically changing description of the target language (behaviourally correct learning), only partial results are known. We complete these results and provide full behaviourally correct maps for different types of data presentation. This is joint work with Timo Kötzing.

Network Creation Games are a popular tool to study the emergence of networks created by selfish agents, such as the Internet. As navigation is an important function of many networks and information about the global network structure is not always available locally, we study Network Creation Games that enforce greedy routing. We examine different variants of that model and analyze the existence of equilibria, computational hardness, as well as other game theoretic properties.
29.06. 16:00virtual Hans Gawendowicz,
Martin Schirneck

During a pandemic people have to find a trade-off between meeting others and staying safely at home. While meeting others is pleasant, it also increases the risk of infection. We consider this dilemma by introducing a game-theoretic network creation model in which selfish agents can form bilateral connections. They benefit from network neighbors, but at the same time, they want to maximize their distance to all other agents. This models the inherent conflict that social distancing rules impose on the behavior of selfish agents in a social network. Besides addressing this familiar issue, our model can be seen as the inverse to the well-studied Network Creation Game by Fabrikant et al.~[PODC 2003] where agents aim at being as central as possible in the created network. Thus, our work is in-line with studies that compare minimization problems with their maximization versions. We look at two variants of network creation governed by social distancing. In the first variant, there are no restrictions on the connections being formed. We characterize optimal and equilibrium networks, and we derive asymptotically tight bounds on the Price of Anarchy and Price of Stability. The second variant is the model's generalization that allows restrictions on the connections that can be formed. As our main result, we prove that Swap-Maximal Routing-Cost Spanning Trees, an efficiently computable weaker variant of Maximum Routing-Cost Spanning Trees, actually resemble equilibria for a significant range of the parameter space. Moreover, we give almost tight bounds on the Price of Anarchy and Price of Stability. These results imply that, compared the well-studied inverse models, under social distancing the agents' selfish behavior has a significantly stronger impact on the quality of the equilibria, i.e., allowing socially much worse stable states. (Joint work with Tobias Friedrich, Pascal Lenzner, and Anna Melnichenko)

An edge fault-tolerant diameter oracle (FDO) is a data structure that preprocesses a graph $$G$$ and, when queried with an edge $$e$$, returns the (approximate) diameter of the graph $$G-e$$ in which that edge failed. The problem of designing FDOs was originally raised by Henzinger et al. [ITCS 2017] and recently received some renewed interest by Bilò et al. [MFCS 2021]. We continue this study with a special focus on space efficienty. Bilò et al. showed that for every positive integer $$m$$ there is a graph $$G$$ with $$m$$ edges such that every FDO for $$G$$ with a stretch better than $$3/2$$ requires $$\Omega(m)$$ bits of space. The original bound was proven with a low-diameter graph and we show that this assumption is essential. For graphs with a large diameter of $$\omega(n^{5/6})$$, we give a $$(1+o(1))$$-approximate FDO taking $$\widetilde{O}(n)$$ space, braking the $$\Omega(m)$$ barrier. This raises the question of the smallest diameter that allows for $$o(m)$$-space oracles. We narrow the gap by first extending the $$\Omega(m)$$-bound up to diameter $$D = O(n/\sqrt{m})$$ and stretch better than $$3/2 - 1/D$$. Conversely, we show that for diameter $$\omega(n^{4/3} \log n/(\varepsilon \sqrt{m}))$$, $$(1+\varepsilon)$$-approximate FDOs with $$o(m)$$ space are possible. Initially, our oracle construction are randomized relying on a sampling technique that is frequently used in fault tolerance. We develop a new framework for efficient derandomization. We show its versatility by not only applying it to our own data structures but also to results from the literature. In particular, we derandomize the distance sensitivity oracle (DSO) by Ren [JCSS 2022] to obtain the first deterministic DSO with subcubic preprocessing time. We also derandomize the Single-Source Shortest Paths algorithm by Chechik and Magen [ICALP 2020]. This is joint work with Davide Bilò, Keerti Choudhary, Sarel Cohen, and Tobias Friedrich.
29.08. 11:00virtual Arthur Zahn
Many real-world networks, like the Internet, are not the result of central design but instead the outcome of the interaction of local agents who are selfishly optimizing for their individual utility. The famous Network Creation Game [Fabrikant et al,, PODC'03] enables us to understand such processes, their dynamics, and their outcomes in the form of equilibrium states. In this model, agents buy incident edges towards other agents for a price of $$\alpha$$ and simultaneously try to minimize their buying cost and their total hop distance. Since in many real-world networks, e.g., social networks, consent from both sides is required to maintain a connection, Corbo and Parkes [PODC'05] proposed a bilateral version of the Network Creation Game, in which mutual consent and payment are required in order to create edges. It is known that the bilateral version has a significantly higher Price of Anarchy, compared to the unilateral version. This is counter-intuitive, since cooperation should help to avoid socially bad states. We investigate this phenomenon by analyzing the Price of Anarchy of the bilateral version with respect to different solution concepts that allow for various degrees of cooperation among the agents. With this, we provide insights into what kind of cooperation is needed to ensure that socially good networks are created. We present a collection of asymptotically tight bounds on the Price of Anarchy that precisely map the impact of cooperation on the quality of tree networks and we find that weak forms of cooperation already yield a significantly improved Price of Anarchy. Moreover, for general networks we show that enhanced cooperation yields close to optimal networks for a wide range of edge price.
14.09. 11:00virtual Lars Seifert
A common phenomenon in urban areas is residential segregation. A well-known model for its occurrence, even in the absence of discriminatory rules and regulations, are Schelling Games, which are driven purely through the decisions of selfish agents. In such games, strategic agents of two different types occupy the nodes of a graph. The utility of an agent on a node depends on the fraction of same-type agents in her neighborhood. Then, agents try to maximize their utility either by jumping to empty nodes or swapping locations with other agents, depending on the game's type. In previous work, it is often assumed that the utility is monotonically increasing with the ratio of same-type agents. Yet, according to surveys, real-world residents prefer diverse neighborhoods. This motivated the study of Schelling Games in which the utility of an agent is not monotonically increasing but instead single-peaked at some value $$\Lambda \in (0,1)$$. In particular, Bilò et al. (2022) introduced and analysed single-peaked Swap Schelling Games. This thesis expands on their work by considering single-peaked Jump Schelling Games. More specifically, we consider two variants that differ in whether an agent's neighborhood includes the agent herself. Furthermore, we consider the effect of restricting agents to local jumps. We study the existence of and convergence to equilibria. In particular, we show that even on regular graphs and trees, convergence to equilibria is not guaranteed. For the self-inclusive variant, there are games without equilibria, even on rings. Yet, we still derive a number of conditions under which the existence of equilibria is guaranteed. Furthermore, we examine the inefficiency of equilibria by bounding the Price of Anarchy and Stability. We give tight bounds for the self-inclusive variant and show that the Price of Anarchy is unbounded for the self-exclusive variant. Moreover, we prove that deciding if an equilibrium can be reached through improving response dynamics starting from a given initial strategy profile is NP-hard. In addition, we perform experiments on torus grids which indicate that games with integration-oriented agents ($$\Lambda \leq \frac{1}{2}$$ ) converge to integrated equilibria.
28.09. 11:00virtualRishikesh Gajjala
Greenberger–Horne–Zeilinger (GHZ) states are quantum states involving at least three entangled particles. They are of fundamental interest in quantum information theory and have several applications in quantum communication and cryptography. Physicists have been designing various experiments to create high-dimensional GHZ states using multiple entangled particles. Krenn et al. have discovered a bridge connecting this problem to theoretical computer science. Namely, they found that an experiment to create a new GHZ state is associated with an edge-coloured edge-weighted multi-graph having certain properties. A graph-theoretic parameter named matching index can be defined for the corresponding unweighted uncoloured simple graph. The edge weights and the edge colourings are parameters which can be tuned by the experiment designer. The maximum dimension of the GHZ state achievable for an experiment by tuning the edge weights and the edge colourings equals the matching index of the corresponding unweighted uncoloured simple graph. Krenn conjectured that the matching index of a graph non-isomorphic to $K_4$ is at most $2$, and that of $K_4$ is $3$. This problem is surprisingly hard to prove, even on graphs as small as $K_4$. Though it was shown with extensive use of computers that the matching index of $K_4$ is $3$, an analytical proof is not known for this fact. In this talk I will present a bound on the dimension which a GHZ state can achieve with respect to the number of entangled particles when bi-coloured edges are not allowed. For the more general case, we resolve Krenn’s conjecture for some graph classes. Based on joint works with L. Sunil Chandran Some results of the results I present are available here.
11.10. 13:30virtual Leila Parsaei-Majd

Clustering is the problem of partitioning data points into clusters based on their similarity. Correlation clustering provides a method for separating the vertices of a signed graph into the optimum number of clusters without specifying that number in advance, and the main objective in this type of clustering is to minimize the sum of the number of negative edges inside each cluster and the number of positive edges between each pair of clusters. For the first time, correlation clustering was introduced by Bansal, Blum, and Chawla in 2004, and has many applications in different fields such as social networks, financial networks, and image segmentation.

In this talk, we present three algorithms. The first one is a local search to separate the vertices into two parts, and the second and third one are related to the general case. Also, by considering two conditions on the signed graphs we can obtain a 2-approximation. The best results obtained in correlation clustering are a 3-approximation for signed k-partite graphs (2015), and a 1.994-approximation for signed complete graphs (2022).