Prof. Dr. Tobias Friedrich

# Research Seminar (Winter Term 2021)

## COVID-19 and virtual seminars

Due to the current situation we are not hosting seminars at the HPI anymore. Instead, we host virtual seminars via zoom. If you want to join the seminar on short notice or if you do not want to subscribe to the mailing list, please send an email to Katrin.Casel(at)hpi.de or to Andreas.Goebel(at)hpi.de.

## Description

A series of talks on current research and interesting topics in algorithm engineering and theoretical computer science. Everyone is welcome to attend talks.

If you want to give a talk about a topic in algorithm engineering or theoretical computer science, please write an e-mail to Dr. Katrin Casel or to Dr. Andreas Göbel.

## Announcements

For announcements of talks, subscribe to our mailing list.

## Talks (click on title to show abstract)

DateRoom SpeakerTitle
15.10. 11:00virtualJatin Batra

We study the generalized min sum set cover (GMSSC) problem, wherein given a collection of hyperedges E with arbitrary covering requirements k_e, the goal is to find an ordering of the vertices to minimize the total cover time of the hyperedges; a hyperedge e is considered covered by the first time when k_e many of its vertices appear in the ordering. We give a 4.642 approximation algorithm for GMSSC, coming close to the best possible bound of 4, already for the classical special case (with all k_e=1) of min sum set cover (MSSC) studied by Feige, Lovász and Tetali, and improving upon the previous best known bound of 12.4 due to Im, Sviridenko and van der Zwaan. Our algorithm is based on transforming the LP solution by a suitable kernel and applying randomized rounding. This also gives an LP-based 4 approximation for MSSC. As part of the analysis of our algorithm, we also derive an inequality on the lower tail of a sum of independent Bernoulli random variables, which might be of independent interest and broader utility.

Another well-known special case is the min sum vertex cover (MSVC) problem, in which the input hypergraph is a graph and k_e=1, for every edge. We give a 16/9 approximation for MSVC, and show a matching integrality gap for the natural LP relaxation. This improves upon the previous best 1.999946 approximation of Barenholz, Feige and Peleg. Finally, we revisit MSSC and consider the ell_p norm of cover-time of the hyperedges. Using a dual fitting argument, we show that the natural greedy algorithm achieves tight, up to NP-hardness, approximation guarantees of (p+1)^{1+1/p}, for all p>=1. For p=1, this gives yet another proof of the 4 approximation for MSSC.

29.10. 11:00virtualMartin Schirneck

An upper dominating set in a graph is an inclusion-wise minimal dominating set of maximum cardinality. Berge and Duchet in 1975 showed how to compute the upper domination number $$k^*$$ of an $$n$$-vertex graph in time $$O(n^{k^*+3})$$. Conversely, Bazgan et al. [TCS 2018] proved that $$k^*$$ cannot be computed in time $$f(k^*) n^{o(\sqrt{k^*})}$$ for any computable function $$f$$, unless the Exponential Time Hypothesis (ETH) fails. Very recently and independently of each other, Araújo et al. [ArXiv 2021] as well as Dublois, Lampis, and Paschos [CIAC 2021] closed this gap in ruling out any algorithm running in time $$f(k^*) n^{o(k^*)}$$ under ETH.

In the talk, we walk through the proof of Araújo et al. and discuss some open problems regarding the transversal rank of a hypergraph.

5.11. 11:00virtualNico Klodt
Contact processes are used to model various phenomena such as the spread of viruses and information through a network. An interesting aspect of a contact process is whether such an 'infection' becomes epidemic or dies out fast. In the SIS model for contact processes, the underlying network is represented by a graph. Each vertex of that graph can either be healthy or infected. Infected vertices can randomly infect their neighbors and become susceptible to the infection again when they heal. The infection rate λ of the process indicates how frequently vertices infect their neighbors. We analyze the survival time of the contact processes depending on λ on stars and clique stars. Clique stars are stars with additional edges, that connect the leaves to multiple cliques of equal size. We show a smooth transition between fast die out and long survival on stars and clique stars and prove that the additional edges of a clique star only impact the survival time notably if the cliques are sufficiently large.
12.11. 10:00virtualGeorge Skretas
Dynamic networks is a research area in the field of theoretical computer science tasked with investigating the algorithmic and structural properties of networked systems whose structure changes over time. The algorithmic theory of dynamic networks has risen in popularity due to the multitude of motivating systems that exist such as wireless systems, reconfigurable robotics and biological systems. Most of the work focuses on passively dynamic networks where the dynamic changes are decided by an external scheduler. In the past few years, there is a growing interest in actively dynamic networks where the changes are controlled by the network itself. This allows the network to reconfigure its own structure in order to more efficiently solve different tasks. In this talk, I will briefly introduce different models in the area of actively dynamic networks and continue by focusing on one such model where given an initial network G_s, the goal is to program the nodes with a distributed algorithm in order to reconfigure themselves, via edge activations/deactivations into a target network G_t with a small diameter.
19.11. 11:00virtualHans Gawendowicz
During a pandemic, people have to find a trade-off between meeting others and staying at home. While meeting others is pleasant, it also increases the risk of infection. We consider this dilemma by introducing a game-theoretic network creation model in which selfish agents can form bilateral connections. They benefit from their neighbors but at the same time they want to keep large distances to all the other agents. We look at two variants of this model. In the first variant, there are no restrictions on the connections being formed. In the second variant, we restrict which connections can be formed by introducing a host network defining the possible connections. We characterize optimal and equilibrium networks and give bounds on the Price of Anarchy and Price of Stability. For that, we also introduce the concept of Maximum Routing-Cost Spanning Trees, which are spanning trees maximizing the sum of distances between all pairs of nodes.
26.11. 11:00virtualWilhelm Friedemann
While last week was about responsible social distancing, this week will be about the opposite: Hosting the best parties in town. The question is - given a graph representing friendship relationships and a weight function representing each friend's coolness - how to select the group of friends who know each other (are connected) and maximize their summed coolness. Apart from party planning, the problem appears as the 'Maximum-Weight Connected Subgraph Problem (MWCS)' in many real-world domains, ranging from biological network analysis, where one wants to find signaling pathways in protein networks, to video-processing. In the talk, I present the results of my Master's thesis: a variety of fixed-parameter intractability results for the MWCS with respect to the weight and cardinality of the solution. But don't worry, I also provide two FPT algorithms, if you know (due to current pandemic constraints) the maximum number of friends you are allowed to invite.
17.12. 10:00virtualMichael Vaichenker,
Bashini Mahaarachchi
Fair Ratio-Cut is the problem of finding a partitioning of a colored graph that minimizes the ratio between edges cut and the balance of the partitioning, while containing each color to equal amounts within each resulting partition. The consideration of balance yields often more meaningful partitionings and enables the use of divide and conquer approaches for other problems. Fairness constraints allow to mark sensitive attributes in order to avoid solutions that discriminate with regard to the marked attributes. Despite Fair Clustering being a quite popular topic, graph-cut problems were considered only recently under fairness constraints. Fair Ratio-Cut in particular remains mostly a blank slate. My master's thesis contributes first theoretical results for this problem. In this talk I will give you an overview over my results and showcase a proof that Fair Ratio-Cut with many colors cannot be approximated much better than with a factor of $$n^2$$, assuming P ≠ NP.

Managing vehicle parking infrastructure efficiently is important for cities in order to improve traffic planning and avoid congestions. Having a parking space map helps drivers to know where they can park exactly without needing to roam around until they find a vacant parking spot, which directly helps to reduce the CO2 footprint. One way to create such a parking space map is by analysing the aerial images collected by satellites and applying machine learning methods to extract parking spaces information from them. Thus, we introduce a parking space detection machine learning framework that involves pre-processing of satellite images using image augmentation and fusion techniques, and then using a mask-region based convolutional neural network with a feature pyramid network backbone (Mask RCNN FPN) to detect parking spaces on the processed images. We also propose to use a logistic regression model training process as a component of this pipeline in order to improve the performance of the parking space detection framework. The method we propose can be employed successfully to create parking space maps with precise location information using satellite images.
14.01. 10:00virtual Martin Schirneck

A distance sensitivity oracle (DSO) with sensitivity $$f$$ is a data structure that reports the pair-wise distances in a given graph, even when up to $$f$$ specified edges fail. A parameterized decision problem is said to be fixed-parameter tractable (FPT) if, on $$n$$-vertex graphs with parameter $$k$$, the problem can be solved in time $$O(g(k) \cdot poly(n))$$ for some function $$g$$. We combine both ideas and introduce sensitivity oracles for FPT graph problems. Such data structures preprocess the input in time $$O(g(f,k) \cdot poly(n))$$ and report the answer to the graph problem in the presence of up to $$f$$ edge failures in time significantly faster than merely re-running the fastest known FPT algorithm on the respective subgraph. As examples, we present multiple oracles for the $$k$$-Path and the Vertex Cover problem, respectively. Our solutions involve careful trade-offs between the parameters $$f$$ and $$k$$, the space requirement, preprocessing time, as well as the query time.

This is joint work with Davide Bilò, Katrin Casel, Keerti Choudhary, Sarel Cohen, Tobias Friedrich, Gregor Lagodzinski, and Simon Wietheger.

21.01. 11:00virtualMerlin de la Haye
Wireless networks can be a viable alternative to traditional wire-line networks, providing increased flexibility in creating and dynamically modifying the resulting networks. However, the reliability of communication paths decreases with increasing path lengths. The maximum hop count in a network captures this property nicely, and can be used as an indicator for wireless network stability. With the tools of Algorithmic Game Theory, we model this setting with selfish agents, and introduce a wireless version of the Network Creation Game by Fabrikant et al., with a focus on bidirectional connections and different limitations on the maximum hop count in the network. In this setting, we analyze the structure of social optimum and equilibrium networks, focusing on uniformly distributed nodes along a line in 1D. For these networks, we present construction templates for hop counts limited to 2 and 3 hops, and explain how equilibria for arbitrary hop counts can be constructed. Regarding the quality of equilibrium networks, we prove a constant Price of Anarchy for the uniform 1D 2-hop model, and a sqrt(n) bound for both the Price of Stability and Price of Anarchy in the 3-hop case.
28.01. 11:00virtualArthur Zahn
Many real-world networks like the internet are not centrally designed but instead the product of local agents who selfishly try to optimize their own situation. Network Creation Games are a game theoretic tool to analyze exactly such processes. The agents are modelled of vertices in a graph and an agent can build an edge to a different agent by paying a fixed price $$\alpha$$. Each agent selfishly tries to minimize the cost it pays for buying edges and the summed distance to all other agents in the graph - these are two conflicting goals. We consider the bilateral version of Network Creation Games, where an edge {u,v} is only built if both u and v are willing to pay for it. For instance, this version captures social networks more accurately, because being friends cannot be done unilaterally. In particular, we analyze the price of anarchy for such networks, which is the ratio between the worst equilibrium network and the centrally designed social optimum.

Schelling Facility Location is an agent based game in which there are customers and facilities. The binary colored customers want to maximize the ratio of equally colored customers served by the same facility, while the facilities want to maximize the number of customers they serve. This setting is analyzed for its game theoretic properties, such as the existence of equilibria and the price of anarchy.
25.02. 11:00virtual Master-Project 22/21
Lars Seifert

While crossover is ubiquitous in evolutionary computing, many theoretical analyzes have a strong focus on operators varying only a single parent individual. In order to understand better how and why crossover can benefit optimization, we consider pseudo-Boolean functions with an upper bound $$B$$ on the number of 1s allowed in the bit string (cardinality constraint). For the natural translation of the OneMax test function to this setting, the literature gives a bound of $$\Theta(n^2)$$ for the (1+1) EA. Part of the difficulty when optimizing this problem lies in having to improve individuals meeting the cardinality constraint by flipping both a 1 and a 0. The experimental literature proposes balanced operators, preserving the number of 1s, as a remedy. We show that a balanced mutation operator optimizes the problem in $$O(n\log n)$$ if $$n-B=O(1)$$. However, if $$n-B=\Theta(n)$$, we show a bound of $$\Omega(n^2)$$, just as classic bit flip mutation. Crossover and a simple island model gives $$O(n^2\log n)$$ (uniform crossover) and $$O(n\sqrt n)$$ (3-ary majority vote crossover). For balanced uniform crossover with Hamming distance maximization for diversity we show a bound of $$O(n\log n)$$. As an additional contribution we analyze and discuss different balanced crossover operators from the literature.

Residential segregation can be described using so called Schelling Games, where strategic agents of different types "live" on a graph and their utility depends on the fraction of same type agents in their neighborhood. To improve their utility, they can either swap with other agents or move (technically jump) to other, currently empty, nodes of the graph, until they reach equilibrium (or don't). Typically, the agents are assumed to be slightly shapist polygons or to have some other intolerant tendency: Their utility function is monotone, i.e. the higher the fraction of same-type agents in their neighborhood, the higher their utility. However, social studies show that (most) humans actually prefer diversity in their residential area. Thus [Bilò et al, tbd] suggest a modified Swap Schelling Game, where the utility function f(x), is single peaked, f.e. agents have maximal utility when their neighborhood has a 50-50 composition. The goal of this master thesis will be to analyse the Jump Schelling Game with this single peaked utility function. In particular, typical game-theoretic questions like the existence of equilibria and properties of the equilibria (price of anarchy/stability) are of interest.
11.03. 11:00virtualKirill Simonov

Fair clustering is a variant of constrained clustering where the goal is to partition a set of colored points. The fraction of points of each color in every cluster should be more or less equal to the fraction of points of this color in the dataset. We present a new construction of coresets for fair k-means and k-median clustering for Euclidean and general metrics based on random sampling. For the Euclidean space, we provide the first coresets whose size does not depend exponentially on the dimension d. For general metric, our construction provides the first coreset for fair k-means and k-median. With the help of the coreset, we design better approximation and streaming algorithms for fair and other constrained clustering variants. In particular, we obtain the first fixed-parameter tractable PTAS for fair k-means and k-median clustering in R^d and the first FPT "true'' constant-approximation for metric fair clustering.

18.03. 11:00virtualRalf Rothenberger

In this talk I introduce the Minimum Linear Arrangement Problem and our recent and upcoming results on it. But more importantly, I will depict a classical journey of researching a problem with its ups and downs, detours, and lucky (or unlucky) accidents until you are finally able to prove...something! Disclaimer to fresh PhD students: This is how research really works! Don't be discouraged.

25.03. 11:00virtual Timo Kötzing

I consider a simple noisy optimization problem: the value of a bit string is its number of 1s plus a random variable drawn from a (centered) Gaussian distribution. It was no surprise to me that simple hill-climbers (like random local search or the 1+1 Evolutionary Algorithm) cannot handle even small noise levels. It was a big surprise to me that some other algorithms (like some estimation of distribution algorithms and crossover-based algorithms) can deal even with big noise effectively. I came to understand this latter phenomenon as essentially the result of averaging over many iterations, and some parameters have to be scaled up in order to deal with higher noise.

However, even considering these effects, some algorithms were unreasonably effective. In my talk I want to discuss some further findings that seem to indicate that, for example, the compact genetic algorithm can handle Gaussian noise with O(n) variance (which is a lot!) without any scaling of parameters or any other tricks -- just out of the box. This is thanks to sampling widely different individuals a lot of the time, which I will elaborate on in my talk.