Prof. Dr. Tobias Friedrich

# Research Seminar (Summer Term 2020)

## COVID-19 and virtual seminars

Due to the current situation we are not hosting seminars at the HPI anymore. Instead, we host virtual seminars via zoom. If you want to join the seminar on short notice or if you do not want to subscribe to the mailing list, please send an email to ralf.rothenberger(at)hpi.de.

## Description

A series of talks on current research and interesting topics in algorithm engineering and theoretical computer science. Everyone is welcome to attend talks.

If you want to give a talk about a topic in algorithm engineering or theoretical computer science, please write an e-mail to Prof. Tobias Friedrich or to Ralf Rothenberger.

## Announcements

For announcements of talks, subscribe to our mailing list.

## Talks (click on title to show abstract)

DateRoom SpeakerTitle
20.04. 14:00virtualSarel Cohen

In this talk I present several results obtained during my PhD studies.

We first present the model of dynamic and fault-tolerant graph algorithms. We then focus on variants of shortest path algorithms and data-structures in setting with failures. In particular, we study the following problems.

Distance Sensitivity Oracles. An f-Sensitive Distance Oracle (f-DSO) with stretch $$\alpha$$ preprocesses a graph $$G(V,E)$$ and produces a small data structure that is used to answer subsequent queries. A query is a triple consisting of a set $$F \subset E$$ of at most $$f$$ edges, and vertices s and t. The oracle answers a query $$(F,s,t)$$ by returning a value $$\tilde{d}$$ which is equal to the length of some path between $$s$$ and $$t$$ in the graph $$G\setminus F$$ (the graph obtained from $$G$$ by discarding all edges in $$F$$). Moreover, $$\tilde{d}$$ is at most $$\alpha$$ times the length of the shortest path between $$s$$ and $$t$$ in $$G\setminus F$$. The oracle can also construct a path between $$s$$ and $$t$$ in $$G\setminus F$$ of length $$\tilde{d}$$.

Here we present several results regarding distance sensitivity oracles. First, in a joint work with Shiri Chechik, Amos Fiat and Haim Kaplan [SODA 2017] we presented, to the best of our knowledge the first nontrivial $$f$$-sensitive distance oracle with fast query time and small stretch capable of handling multiple edge failures. Specifically, for any $$f = o(\frac{\log n}{\log \log n})$$ and a fixed $$\varepsilon > 0$$ our oracle answers queries $$(F,s,t)$$ in time $$\widetilde{O}(1)$$ with $$(1+\varepsilon)$$ stretch using a data structure of size $$n^{2+o(1)}$$. For comparison, the naive alternative requires $$m^fn^2$$ space for sublinear query time.

Second, in a joint work with Shiri Checik [STOC 2020], we presented the first distance sensitivity oracle with subcubic preprocessing time and very fast query time for directed graphs with integer weights in the range $$[-M,M]$$. Our distance sensitive oracle supports a single vertex/edge failure in subcubic $$\tilde{O}(Mn^{\omega+1/2}) = \tilde{O}(Mn^{2.873})$$ preprocessing time for $$\omega=2.373$$, and near optimal query time of $$\tilde{O}(1)$$. For comparison, with the same $$\tilde{O}(Mn^{2.873})$$ preprocessing time the DSO of Grandoni and Vassilevska Williams [FOCS 12] has $$\tilde{O}(n^{0.693})$$ query time. In fact, the best query time their algorithm can obtain is $$\tilde{O}(Mn^{0.385})$$ (with $$\tilde{O}(Mn^3)$$ preprocessing time).

The Replacement Paths problem. Let G = (V,E) be a directed unweighted graph with n vertices and m edges and let P be a shortest path from $$s$$ to $$t$$ in $$G$$. The {\sl replacement paths} problem is to find for every edge $$e \in P$$ the shortest path from s to t avoiding e. Roditty and Zwick [ICALP 2005] obtained a randomized algorithm with running time of $$\tilde{O}(m \sqrt{n})$$.

In a joint work with Shiri Chechik and Noga Alon we presented the first deterministic algorithm for this problem [ICALP 2019], with the same $$\tilde{O}(m \sqrt{n})$$ time. Due to matching conditional lower bounds of Williams {\sl et. al.} [FOCS 2010], our deterministic combinatorial algorithm for the replacement paths problem is optimal up to polylogarithmic factors (unless the long standing bound of $$\tilde{O}(mn)$$ for the combinatorial boolean matrix multiplication can be improved).

The Single Source Replacement Paths Problem. Given a graph $$G=(V,E)$$ with n vertices and m edges, a source vertex $$s$$ and a shortest paths tree $$T_s$$ rooted in $$s$$, output for every vertex $$t \in V$$ and for every edge $$e$$ in $$T_s$$ the length of the shortest path from $$s$$ to $$t$$ avoiding $$e$$.

In a joint work with Shiri Chechik [SODA 2019], we presented near optimal upper bounds, by providing $$\tilde{O}(m\sqrt{n} +n^2)$$ time randomized combinatorial algorithm for unweighted undirected graphs, and matching conditional lower bounds for the SSRP problem.

23.04. 11:10virtualThomas Bläsius
Magic: The Gathering is a popular and famously complicated trading card game about magical combat. In this talk, I present a result by Churchill, Biderman, and Herrick, who show that optimal play in real-world Magic is at least as hard as the Halting Problem, solving a problem that has been open for a decade. To do this, an arbitrary Turing machine is embedded into a game of Magic such that the first player is guaranteed to win the game if and only if the Turing machine halts.
04.05. 11:00virtualThomas Bläsius
In this talk, I will present a result that we did not deem important enough to make it into the title of our paper. That being said, it is still a pretty cool result on the complexity of weighted higher-order Voronoi diagrams of random points in higher dimensions with arbitrary p-norm.
15.06. 11:00virtualDavid Stangl
Branching algorithms are ubiquitous when solving hard problems. However, most of them are oblivious to the instance they are solving: they simply apply a pre-determined set of reduction and pruning rules. We enhance such algorithms by learning an improved branching order while solving, that allows these rules to more effectively prune the search tree. In this talk we present the ongoing work of applying this principle to two versions of the Hitting Set problem.
22.06. 11:00virtualPascal Führlich
The idea is to create a web service, that allows consumers to easily search for readily usable implementations of standard algorithms and data structures. Contributors can contribute implementations, that are tested and benchmarked (learning opportunity!) on a server. The platform itself could be built in a project seminar or by tutors. As soon as the platform is available teams of students could implement specific state of the art papers or try to be faster than a boost implementation.
22.06. 11:00virtualPascal Führlich
The Swiss system is a method for determining tournament pairings in various games and (e-)sports, especially chess. A core idea is to pair players with equal score. Therefore intentionally losing (=cheating) in early rounds might lead to weaker opponents afterwards and thus, a better rank at the end. In this talk we present a simulation-based comparison of Swiss-system variants with regard to cheating.
22.06. 11:30virtualLukas Behrendt
Network creation games (NCGs) model the decentralized creation of networks by selfish agents. In his master's thesis, Hagen Echzell recently introduced two new "flow-based" games which differ from previous models in that they try to maximize the local edge-connectivity between pairs of nodes. While Hagen's work focuses on Nash equilibria for these games, I take a closer look at best responses, the optimal strategies upon which the concept of Nash equilibrium builds. In this talk, I present an algorithm that given a Minimum Flow NCG instance finds a best response for an agent in polynomial time.
29.06. 11:00virtualÁgnes Cseh

The Young Physicists Tournament is an established team-oriented scientific competition between high school students from 37 countries on 5 continents. The competition consists of scientific discussions called Fights. Three or four teams participate in each Fight, each of whom presents a problem while rotating the roles of Presenter, Opponent, Reviewer, and Observer among them.

The rules of a few countries require that each team announce in advance 3 problems they will present at the national tournament. The task of the organizers is to choose the composition of Fights in such a way that each team presents each of its chosen problems exactly once and within a single Fight no problem is presented more than once. Besides formalizing these feasibility conditions, in this paper we formulate several additional fairness conditions for tournament schedules. We show that the fulfillment of some of them can be ensured by constructing suitable edge colorings in bipartite graphs. To find fair schedules, we propose integer linear programs and test them on real as well as randomly generated data.

10.08. 11:00virtualNikhil Kumar

In this work, we bound the integrality gap and the approximation ratio for maximum plane multiflow problems and deduce bounds on the multiflow-multicut-gap. Planarity means here that the union of the supply and demand graph is planar. We first prove that there exists a multiflow of value at least half of the capacity of a minimum multicut. We then show how to convert any multiflow into a half-integer one of value at least half of the original multiflow. Finally, we round any half-integer multiflow into an integer multiflow, losing again at most half of the value, in polynomial time, achieving a 1/4-approximation algorithm for maximum integer multiflows (and hence the edge disjoint paths problem), and an integer-multiflow-multicut gap of 8.

12.08. 13:00virtualFrancesco Quinzan

Submodular functions capture the notion of diminishing returns. This notion occurs frequently in the real world, thus, the problem of maximizing a submodular function finds applicability in a plethora of scenarios. While being NP-hard and APX-hard, submodular maximization problems admit non-trivial approximation algorithms.

In this talk, I will discuss how submodularity can be used to design efficient algorithms to tackle real-world optimization problems. I will describe three relevant AI problems and, for each one of them, I will give an approximation algorithm. I will show how the notion of submodularity pertains each problem, and how it was used to design the corresponding approximation algorithms.

17.08. 11:00virtualClemens Fruböse

This talk deals with graph embeddings into spaces of non-constant curvature. I will at first review applications of graph embeddings, their advantages and flaws. These comprise in particular dealing with structures of non-hierarchical fashion, i.e. structures which do not fit in spaces of constant curvature. Without delving too much into mathematical detail, I will point out the benefits which symmetric spaces provide. I will then present some experimental results which originate in my masters thesis. These results show obstacles, which I will then discuss.

21.08. 11:00virtualKshitij Gajjar

Suppose you are given a road network with n traffic signals such that the amount of traffic on each road varies as a linear function of time. Then, the shortest path from a source s to a destination t might be different at different points of time. It was well known (Gusfield, 1980) that the number of different shortest s-t paths is at most $$n^{O(\log n)}$$.

In this talk, we will show that there are road networks sans bridges (planar road network) for which the number of different shortest s-t paths is at least $$n^{\Omega(\log n)}$$, refuting a conjecture of Nikolova (2009). This is joint work with Jaikumar Radhakrishnan (TIFR). We will also see a variant of this problem, which is part of an ongoing work with Prerona Chatterjee (TIFR) and Girish Varma (IIITH).

25.08. 11:00virtualSimon Krogmann
In facility location games, players choose locations to create facilities, which serve clients. Depending on the locations of the facilities, the clients choose which of them to visit. While these games may be played in any domain, e.g., in k-dimensional space or on a set of markets, we study facility location games on networks. In particular, we consider models in which each node represents both a client and a possible location for facilities. Each of the k players places exactly one facility on any node in the network and clients distribute their weights depending on a specific model. The utility of a player is the amount of weight her facility receives. In this work, we discuss and compare several facility location games, which differ in client behavior. We provide a common framework to describe these games and introduce three new games fitting this framework: Load Balancing with Limited Attraction Ranges, Uniform Limited Attraction Ranges, and Distance Proportional.
In the game Load Balancing with Limited Attraction Ranges, clients visit the facilities with the lowest loads within a certain distance. We show that despite this complex client behavior, player utilities can be computed in polynomial time and pure Nash equilibria exist in all instances. When measuring social welfare by the number of serviced clients, we bound the price of anarchy to between 2 - 1/k and 2. Furthermore, we show that computing the social optimum for this measure is NP-hard. We also introduce Uniform Limited Attraction Ranges, in which clients uniformly spread their weight among all facilities within a maximum distance. For this game, we prove the same properties and provide a tight upper bound for the price of anarchy of 2 - 1/k. Last, we introduce Distance Proportional, which does not admit pure Nash equilibria in all instances. In addition to our new games, we also express Voronoi Games [DT07] with our framework, a model in which clients always visit the closest facility. Finally, we provide an overview of the results of the facility location games fitting our framework.
28.08. 11:00virtualMarcus Wilhelm

Many NP-hard graph problems can be solved in polynomial time if the input instance is a tree. The same holds for input graphs that have small treewidth, a parameter describing how tree-like a graph is. While many graph problems are fixed-parameter tractable when parametrized by treewidth, in most cases this requires the computation of a tree decomposition of the input graph, which is NP-hard as well.

By contrast, an algorithm (PID-BT), submitted to the PACE Challenge 2017 achieved surprisingly fast running times on real-world networks despite the problem's hardness. This points at a big gap between the theoretical understanding of the problem and the practical performance of the algorithm.

In this thesis, we work towards reducing this gap, focussing on hyperbolic random graphs (HRGs) as a realistic model of real-world networks. On the theoretical side, we start by giving an intuitive presentation of the algorithm and show a super-polynomial lower bound on its expected running time on HRGs. This indicates that the observed performance in practice might be due to the implementation's preprocessing. We confirm this via empirical experiments, that show that the employed greedy heuristics discard a large linear part of the graph. We also explain this theoretically, by proving the existence of a linear expected number of simplicial vertices in HRGs.

01.09. 11:00virtualMartin Krejca
In the domain of nature-inspired optimization heuristics, two predominant approaches exist: evolutionary algorithms (EAs), which evolve a multiset of solutions, and estimation-of-distribution algorithms (EDAs), which evolve a probabilistic model of the search space. While both approaches have been applied to real-world problems with great success, it is often said that EDAs have an edge over EAs, as their probabilistic model provides insights into how good solutions are created, whereas EAs only provide a set of good solutions. However, this statement has to be handled with great care, as the extent to which an EDA can explain the objective function greatly depends on the complexity of its probabilistic model and on the shape of the objective function itself. Interestingly, this direction of research is not well explored so far.

In this talk, we introduce the multimodal optimization problem EqualBlocksOneMax (EBOM), which has an exponential number of optima (measured in the dimension of the problem). While all optima follow a simple pattern, EAs fail to provide a satisfactory explanation to this pattern, due to the sheer size of the set of optima, and EDAs whose models do not capture dependencies fail to reflect this pattern, due to the low complexity of their model. Consequently, we focus on EDAs with more elaborate models. We show that the arguably easiest EDA that uses dependencies in its model – the mutual-information-maximizing input clustering (MIMIC) – quickly produces a model that behaves empirically almost identically to an ideal model for EBOM. This highlights the benefit of a more complex EDA for a rather simple function when interested in explainability.

01.09. 11:30virtualMaximilian Böther
Traditional navigation services find the fastest route for a single driver. Though always using the fastest route seems desirable for every individual, selfish behavior can have undesirable effects such as higher energy consumption and avoidable congestion, even leading to higher overall and individual travel times. In contrast, strategic routing aims at optimizing the traffic for all agents regarding a global optimization goal. We introduce a framework to formalize real-world strategic routing scenarios as algorithmic problems and study one of them, which we call Single Alternative Path (SAP), in detail. There, we are given an original route between a single origin–destination pair. The goal is to suggest an alternative route to all agents that optimizes the overall travel time under the assumption that the agents distribute among both routes according to a psychological model, for which we introduce the concept of Pareto-conformity. We show that the SAP problem is NP-complete, even for such models. Nonetheless, assuming Pareto-conformity, we give multiple algorithms for different variants of SAP, using multi-criteria shortest path algorithms as subroutines. Moreover, we prove that several natural models are in fact Pareto-conform. The implementation and evaluation of our algorithms serve as a proof of concept, showing that SAP can be solved in reasonable time even though the algorithms have exponential running time in the worst case.
22.09. 13:00virtualMartin Schirneck

In non-uniform hypergraphs there exists a phenomenon unknown to graphs: some edge may be contained in another, with edges even forming chains of inclusions. In many algorithmic applications we are only interested in the collection of inclusion-wise minimal edges, called the minimization of the hypergraph.

In the video we highlight our recent results on the minization of maximum-entropy hypergraphs with a prescribed number of edges and expected edge size. We give tigh bounds on the expected number of minimal edges and briefly touch on the tools used in the proofs. The most important technical contribution is an improvement of the Chernoff-Hoeffding theorem on the tail of the binomial distribution. In particular, we show that for a random variable $$X \sim \mathrm{Bin}(n,p)$$ and any $$0 < x < p$$, it holds that $$\mathrm{P}[X \le xn] = \Theta( 2^{-\mathrm{D}(x \,{\|}\, p) n}/\sqrt{n})$$, where $$\mathrm{D}$$ is the Kullback-Leibler divergence from information theory.

This is joined work with Thomas Bläsius and Tobias Friedrich.

22.09. 13:00virtualThomas Bläsius

Unique column combinations (UCCs) are a fundamental concept in relational databases. They identify entities in the data and support various data management activities. Still, UCCs are usually not explicitly defined and need to be discovered. State-of-the-art data profiling algorithms are able to efficiently discover UCCs in moderately sized datasets, but they tend to fail on large and, in particular, on wide datasets due to run time and memory limitations. In this video, we introduce HPIValid, a novel UCC discovery algorithm that implements a faster and more resource-saving search strategy. HPIValid models the metadata discovery as a hitting set enumeration problem in hypergraphs.In this way, it combines efficient discovery techniques from data profiling research with the most recent theoretical insights into enumeration algorithms. Our evaluation shows that HPIValid is not only orders of magnitude faster than related work, it also has a much smaller memory footprint.

This is joined work with Johann Birnick, Tobias Friedrich, Felix Naumann, Thorsten Papenbrock, and Martin Schirneck.