Prof. Dr. Tobias Friedrich

# Research Seminar (Winter Term 2020)

## COVID-19 and virtual seminars

Due to the current situation we are not hosting seminars at the HPI anymore. Instead, we host virtual seminars via zoom. If you want to join the seminar on short notice or if you do not want to subscribe to the mailing list, please send an email to Katrin.Casel(at)hpi.de or to Andreas.Goebel(at)hpi.de.

## Description

A series of talks on current research and interesting topics in algorithm engineering and theoretical computer science. Everyone is welcome to attend talks.

If you want to give a talk about a topic in algorithm engineering or theoretical computer science, please write an e-mail to Dr. Katrin Casel or to Dr. Andreas Göbel.

## Announcements

For announcements of talks, subscribe to our mailing list.

## Talks (click on title to show abstract)

DateRoom SpeakerTitle
13.10. 13:00virtualMartin Krejca

Drift theory is a collection of theorems that bound the expected stopping time of time-discrete random processes over the reals that have a consistent bias – the drift – in decreasing in expectation in each step. The beauty of drift theory stems from translating a bound for the drift immediately into a bound for the expected stopping time. In other words, local information about the one-step change of the process is turned into global information about the behavior of the entire process. However, a downside of drift theory is that the application of a drift theorem is usually preceded by defining a useful potential function that transform the random process into one that has the desired drift. This step is not straightforward.

One problem in determining a useful potential function is that the transformed process needs to have drift in each step, that is, no detours into the wrong direction are allowed, even when they are deterministic. In this talk, we present the time-flexible drift theorem, a new drift theorem that does not have this limitation. It generalizes the most general drift theorem as well as similar approaches that bound the expected stopping time. The time-flexible drift theorem allows for time periods in the random process that do not show the desired drift. Such periods are handled separately and then incorporated into the final result.

This talk presents work in progress (that I came up with last week). In fact, I have no super cool application for this new theorem in mind. I stumbled upon it while thinking about something different. So I would be glad to get some feedback or ideas or start some new collaboration. As a feature, some parts of the talk are very formal (and assume a very formal background in probability theory), which should motivate you to have an in-depth discussion with me after the talk if you would like to learn more =D. Nonetheless, the talk also includes the mandatory (somewhat) colorful pictures with fancy animations and is aimed to convey the intuition of the math beneath.

20.10. 13:00virtualArdalan Khazraei
The cost-distance Steiner tree problem asks for a Steiner tree in a graph that minimizes the total cost plus a weighted sum of path delays from the root to the sinks. We present an improved approximation for the uniform cost-distance Steiner tree problem, where the delay of a path corresponds to the sum of edge costs along that path. Previous approaches deploy a bicriteria approximation algorithm for the length-bounded variant that does not take the actual delay weights into account. Our algorithm modifies a similar algorithm for the single-sink buy-at-bulk problem by Guha et al. [2009], allowing a better approximation factor for our problem. In contrast to the bicriteria algorithms it considers delay weights explicitly. Thereby, we achieve an approximation factor of (1+β), where β is the approximation factor for the Steiner tree problem. This improves the previously best known approximation factor for the uniform cost-distance Steiner tree problem from 2:87 to 2:39. This algorithm can be extended to the problem where the ratio of edge costs to edge delays throughout the graph is bounded from above and below. In particular, this shows that a previous inapproximability result (Chuzhoy et al. [2008]) requires large variations between edge delays and costs. Finally, we present an important application of our new algorithm in chip design. The cost-distance Steiner tree problem occurs as a Lagrangean subproblem when optimizing millions of Steiner trees with mutually depending path length bounds. We show how to quickly approximate a continuous relaxation of this problem with our new algorithm.
23.10. 13:00virtualGregor Lagodzinski

During the first half of this year I worked at SAP as an intern, to be precise, at the SAP Innovation Center Network location in Potsdam. The goal of this internship was to see, observe and take part of the world outside of academia. Needless to say, things did not go as planned and I got first-hand experience into how a big company like SAP handles a special situation like the COVID-19 pandemic. First and foremost, they will still work and so did I.

In this lightweight talk I will present my insights gained during the internship. This will incorporate a narrative about the process before the internship, a very rough insight in my tasks, and a first example of what a Zero-knowledge proof is. Finally, I will present my personal résumé and an outlook as to how it affected my near future.

10.11. 11:00virtualArthur Zahn
Time series are sequences of data indexed by time. Such data are collected in various domains, often in massive amounts, such that storing them proves challenging. Thus, time series are commonly stored in a compressed format. An important compression approach is piecewise linear approximation (PLA), which only keeps a small set of time points and interpolates the remainder linearly. Picking a subset of time points such that the PLA minimizes the mean squared error to the original time series is a challenging task, naturally lending itself to heuristics. We propose the piecewise linear approximation genetic algorithm (PLA-GA) for compressing time series by PLA. The PLA-GA is a memetic (μ+λ) GA that makes use of two distinct operators tailored to time series compression. First, we add special individuals to the initial population that are derived using established PLA heuristics. Second, we propose a novel local search operator that greedily improves a compressed time series. We compare the PLA-GA empirically with existing evolutionary approaches and with a deterministic PLA algorithm, known as Bellman's algorithm, that is optimal for the restricted setting of sampling. In both cases, the PLA-GA approximates the original time series better and quicker. Further, it drastically outperforms Bellman's algorithm with increasing instance size with respect to run time until finding a solution of equal or better quality -- we observe speed-up factors between 7 and 100 for instances of 90,000 to 100,000 data points.
11.11. 10:15virtualMaximilian Langner
Local shtukas can be viewed as the analogs to p-divisible groups in the case of function fields. We compute a special Rapoport-Zink space associated with a family of local shtukas and its period morphism. Furthermore, we compare this moduli space to its analog in mixed characteristic.
18.11. 10:15virtualStefan Neubert

Basic textbook complexity analysis usually inspects the total time of an algorithm. Whilst this yields important insights into the overall performance of the algorithm, it often fails to convey meaning in terms of the problem: *Why* does this problem seem to take this time to be solved?

In our work, we instead analyze how fast one can compute meaningful parts of a solution, thereby identifying which aspects of a problem seem to be hard (inefficient) to solve. Specifically, we proof both upper and lower bounds on the *delay* in-between produced partial solutions of shortest distance problems on graphs.

From a theory point of view, this provides a better understanding of the analyzed problems and can possibly help in developing faster algorithms. Additionally, the results provide insights into how one can solve the analyzed problem within a data processing pipeline to speed up subsequent steps that can work on partial solutions instead having to wait for all parts.

19.11. 16:00virtualKatrin Casel

In a typical introductory course on theoretical computer science one learns that polynomial time reductions are used to show that a computational problem cannot be solved in polynomial time (unless P=NP). It seems that beyond this plain NP-hardness conclusion there is not much to be learned from such reductions; they seem to merely serve as justification to actually start studying the problem at hand since it is officially classified as being hard.

This talk will remedy the false reputation of reductions as blunt instruments for NP-hardness. Reductions in fact yield powerful connections between seemingly unrelated problems that reveal useful similarities. To support this claimed power of reductions, we will look at examples from several application areas which illustrate the different types of information that can be transferred. With the help of appropriate reductions it is possible to inherit approximation results (approximation-preserving reductions) or to derive strict specific running time lower bounds (fine-grained reductions). Further, reductions can give us insight into the limits of particular algorithmic strategies but also translate successful strategies from one problem to another.

25.11. 10:15virtualSarel Cohen

In this talk I will review several results in Deep Learning and present relevant research directions. First, I'll present two existing research directions. The first relates to combinatorial optimzation using deep learning - we describe two previous results where deep neural networks are used to solve combinatorial optimization problems. First we will discuss about pointer networks [NIPS 2015] which are used to compute approximations of the convex hull of a set of points. The second result "Combinatorial Optimization with Graph Convolutional Networks and Guided Tree Search" [NIPS 2018] use deep neural networks to compute approximate solution to NP-hard problems such as Maximum Independent Set, Maximum Vertex Cover and Maximal Clique. We present a new way to cope with NP hard problems using deep neural networks. The second line of research relates to computer vision tasks. More specifically we explore the identification of mask wearing in images and videos. We present deep neural networks for detecting faces in images and classifying them as either wearing masks or not-wearing masks (or wearing masks incorrectly). Later we will try to measure mask wearing in TV news broadcasts and correlate it with either mask wearing in the general public or even with COVID-19 infection rates. As time permits we will present additional research directions related to deep learning.

02.12. 10:15virtualMaster Project 2020

In this talk, we will give a short introduction to the topic of last semesters master project and the results we achieved. In the project we studied a recently proposed Euclidean Generalized Network Creation Game by Bilò et al.[SPAA 2019] and we investigated the creation of (beta,gamma)-networks, which are in beta-approximate Nash equilibrium and have a total cost of at most gamma times the optimal cost. In our model we have n agents corresponding to points in Euclidean space create costly edges among themselves to optimize their centrality in the created network. Our main result is a simple O(n^2)-time algorithm that computes a (beta,beta)-network with low beta for any given set of points. Along the way, we significantly improve several results from Bilò et al. and we asymptotically resolve a conjecture about the Price of Anarchy.

09.12. 10:15virtualJannik Peters,
Davis Issac
Popular Matchings have been one of the most popular topics in matching theory. In my master thesis work so far, I considered popular matchings on a restricted type of three dimensional matching, where the preferences are cyclically ordered. I will give a short overview over the structural and NP-hardness results I have found so far and I will talk about promising directions for further research.

I will talk about the fixed parameter tractability of (weighted) edge clique partition problem. The talk is based on my paper at IPEC 2020 co-authored with Andreas Feldmann and Ashutosh Rai. We develop an FPT algorithm and a compression for the Weighted Edge Clique Partition (WECP) problem, where a graph with $$n$$ vertices and integer edge weights is given together with an integer $$k$$, and the aim is to find $$k$$ cliques, such that every edge appears in exactly as many cliques as its weight. The problem has been previously only studied in the unweighted version called Edge Clique Partition (ECP), where the edges need to be partitioned into $$k$$ cliques. It was shown that ECP admits a kernel with $$k^2$$ vertices [Mujuni and Rosamond, 2008], but this kernel does not extend to WECP. The previously fastest algorithm known for ECP has a runtime of $$2^{O(k^2)}n^{O(1)}$$ [Issac, 2019 (PhD Thesis)]. For WECP, we develop a compression (to a slightly more general problem called Annotated WECP) with at most $$4^k$$ vertices, and an algorithm with runtime $$2^{O(k^{3/2}w^{1/2}\log(k/w))}n^{O(1)}$$, where $$w$$ is the maximum edge weight. The latter in particular improves the runtime for ECP to $$2^{O(k^{3/2}\log k)}n^{O(1)}$$.
16.12. 10:15virtualRalf Rothenberger

Satisfiability is considered the canonical NP-complete problem and is used as a starting point for hardness reductions in theory, while in practice heuristic SAT solving algorithms can solve large-scale industrial SAT instances very efficiently. One possible avenue towards understanding the power and limitations of modern SAT solvers is via proof complexity.

In this talk I give a short introduction to the celebrated DPLL and CDCL SAT solving algorithms and their connection to the resolution proof system. This connection can be used to derive lower bounds on the running time of those algorithms for families of SAT instances.

6.01. 10:15virtualWilhelm Friedemann
Given a vertex-weigthed graph G (negative weights permitted), the 'Maximum Weight Connected Subgraph Problem' (MWCS) is to find the connected subgraph G' of G that maximizes the sum of the weights of all vertices in G'. MWCS is known to be NP-hard and W[2]-hard with regards to the weight of the subgraph as parameter. For my master thesis, I investigate whether MWCS is in FPT with regards to the number of vertices in G'. I will present my current results, which place MWCS in FPT, if we already know the solution size. Furthermore, I will talk about my next steps and discuss some problems with you.
13.01. 10:15virtualNikhil Kumar

I will talk about a scheduling problem whose objective is to minimize the total energy consumption. In particular, given n jobs with release dates, deadlines and processing times, we wish to find a minimum energy schedule of the jobs on a processor with a sleep state. A processor with a sleep state can be in one of the two states: active (processes one unit of job/time and consumes 1 unit energy/time) and sleep (doesn't do any processing or consume energy). Furthermore, switching between active and sleep state takes a fixed amount of energy.

Even after more than a decade of work, we still lack a good understanding of the problem. In the talk, I will briefly discuss known results and a recent work which uses LP relaxation to find good solutions. I will end with plenty of interesting research directions.

20.01. 10:15virtualAnna Melnichenko

Understanding real-world networks has been a core research endeavor throughout the last two decades. Network Creation Games are a promising approach for this from a game-theoretic perspective. In these games, selfish agents corresponding to nodes in a network strategically decide which links to form to optimize their centrality. Many versions have been introduced and analyzed, but none fits to modeling the evolution of social networks. In real-world social networks, connections are often established by recommendations from common acquaintances or by a chain of such recommendations. Thus establishing and maintaining contact with a friend of a friend is easier than connecting to strangers. This explains the high clustering, i.e., the abundance of triangles, in real-world social networks.

We propose and analyze a network creation model inspired by real-world social networks. Edges are formed in our model via bilateral consent of both endpoints, and the cost for establishing and maintaining an edge is proportional to the distance of the endpoints before establishing the connection. We provide results for generic cost functions, which essentially only must be convex functions in the distance of the endpoints without the respective edge. Besides the theoretical analysis of the model, we show via experiments that the equilibrium networks of our model indeed closely mimics real-world social networks. This can be seen as a promising first step towards game-theoretic network creation models that predict networks featuring all core real-world properties.

27.01. 10:15virtualPascal Führlich

The Swiss system is a method for determining tournament pairings in various competitive games and (e-)sports, especially chess. For a predetermined number of rounds players with equal score are paired. We present a novel pairing engine based on Edmonds' maximum weight matching algorithm. In comparison to the state-of-the-art pairing engines used at top-level chess tournaments, our approach features much simpler rules, is easier to understand and verify for players, produces fairer pairings, and results in rankings that reflect the players' playing strengths better. We also present our findings on situations in which a player can improve her rank by doing a Swiss gambit: intentionally losing a match in a Swiss-system tournament.

03.02. 10:15virtualGregor Lagodzinski

Modern communication and society at large is reigned by networks of many different types: product chains, social networks, street maps,... The underlying graphs grow steadily and are in many cases THE secret of the owning company. When Facebook claims to have 2 Billion users we are not allowed to check the claim as Facebook will not provide us with the network. The incentive for false claims is definitely given as there is a correlation between network parameters like the size and the stock value. The question arises of how we can verify a claim like the size of a graph without access to the graph.

Katzir, Shikhelman, and Yogev tackled this question [CRYPTO'20]. They gave an interactive proof protocol that estimates the size of a graph within a multiplicative error when the verifier has access to a neighbourhood oracle.

In this talk, I will first present the protocol of Katzir et al. before initiating a discussion on the limitations of said protocol. For this second part, the audience will be split into smaller groups and is expected to discuss the protocol and limitations within these groups. Afterwards, we will re-merge and discuss the protocol together.

10.02. 10:15virtualTimo Kötzing,
Sarel Cohen
I had a projectseminar this semester where the one student taking it built some web-interface for quickly designing and running an evolutionary algorithm. Since it was just one student, we didn't get to do all the exciting things that I had in mind with this interface. If you are interested in joining me for doing some run time experiments and publishing them, let me know!

Fault-Tolerant Fixed Parameter Algorithms
17.02. 10:15virtualSamuel Baguley Wormald's Theorem for the Analysis of Algorithms
02.03. 10:15virtualDavid Stangl
Branching algorithms are used ubiquitously when solving hard problems. Among them, SAT solvers are perhaps the most highly optimized class, benefiting from decades of research. Over the last twenty years, one of the largest single improvements to them was the introduction of VSIDS, a learning branching heuristic. In our work, we attempt to generalize VSIDS to other branching algorithms, using a relatively simple Minimum Hitting Set algorithm as our base. Comparing our heuristic to uniformly-random branching, we observe significant reductions in both the size of the search tree and overall runtime, ranging up to 837 times.
03.03. 10:15virtualMerlin de la Haye,
Hans Gawendowicz
Wireless networks can be a viable alternative to traditional wire-line networks, providing increased flexibility in creating and dynamically modifying the resulting networks. Since the actors in such networks can be mobile devices powered by batteries, such as smart phones, each device has to find a good trade-off between power consumption and service quality. My thesis aims to investigate the formation of wireless networks by means of algorithmic game theory and to analyze some of their properties, such as the existence and quality of stable states.

During a pandemic, people have to weigh between meeting others and staying at home. While meeting others is pleasant, it also increases the risk of infection. I will model and analyze this dilemma from a game theoretical standpoint, inspired by the much discussed Network Creation Games.
10.03. 10:15virtualBP 20/21 (Sebastian Angrick, Ben Bals, Niko Hastrich, Maximilian Kleissl, Jonas Schmidt, Lukas Weyand)

This interim presentation will be a window into the ongoing work of the Bachelor's project 2020.

Given a million real-estate valuation data points and scientific freedom, we aim to enhance the current valuation process. As the core goal of the project, we want to investigate and understand the underlying structure of the data and learn relations not revealed by conventional methods.

We will provide an overview of our two approaches to the problem: deep neural networks and clustering.

We apply a Deep Neural Network that estimates square meter prices. Since DNNs don't have the limitation of being explainable, we hope to get as good results as the domain experts do. Additionally, we can use DNNs to predict house prices of imaginary real estates to get information that can be used to answer further questions about the real estate market.

We apply a simple, hierarchical clustering algorithm on reachability graphs to gain local clusters that reveal the structure of location as a defining factor for valuation. This approach is more explainable as the clustering procedure can be understood and visualizations of the clustering can serve as witnesses for valuation.

In this talk we're happy to discuss our current limitations and how to overcome them with you!

17.03. 10:15virtual Lukas Behrendt
The Minimum Flow Network Creation Game models the decentralized creation of a network by selfish agents which are trying to maximize the network’s bandwidth. Agents are modeled as nodes in a weighted graph and can spend a uniform budget to buy additional edges toward other agents. An agent a’s utility function is given by the graph’s edge connectivity, using the number of nodes with higher than minimum local edge connectivity to a as a tie-breaker. In my thesis, I present an algorithm that computes a best response for the Minimum Flow Network Creation Game in polynomial time, introducing the concepts of sink-minimal mincuts and sink-minimal cut trees along the way.
24.03. 10:15virtualJannik Peters

Every year the fifth semester bachelor students at the HPI have to choose which bachelor project they want to partake in. For this they send a ranking of their top five projects to the Studienreferat. Based on these top five projects a magic algorithm sorts the students into respective projects.

In my talk I present how to model this scenario as an instance of the so called House Allocation with Lower and Upper Quotas problem and I discuss under which parameterizations or structural restrictions the problems of computing and verifying popular or perfect Pareto optimal matchings become intractable. As a special case, I answer an open question and show that a perfect Pareto optimal matching can be found (and verified) in polynomial time, if the maximum lower quota of the instance is 2.

26.03. 11:00virtual Olaf Parczyk

Embedding structures with specific properties is a fundamental problem in graph theory. In this talk I will discuss two different graph models that have a probabilistic and a deterministic ingredient. These combinations of random and deterministic structures interpolate between the pure models and exhibit interesting phenomena.

The first model is the union of a dense graph with large minimum degree and a binomial random graph. Depending on the point of view and choice of parameters one of the parts compensates for the flaws of the other in satisfying certain properties. In the second model we consider the binomial random graph and allow an adversary to remove a fraction of the edges at each vertex. This measures the robustness of the random graph with respect to a given property. There has been a significant amount of research on both models sparking from results on Hamiltonicity. I will introduce the models and highlight some recent developments.

31.03. 10:15virtual Philipp Fischbeck
Bootstrap percolation is a classical model for the spread of information in a network. In the round-based version, nodes of an undirected graph become active once at least r neighbors were active in the previous round. We propose the perturbed percolation process, a superposition of two percolation processes on the same node set. One process acts on a base graph with activation threshold 1, while the other acts on a random graph with threshold r (representing local and global contacts, respectively). We consider a class of grid-like base graphs, in which the bootstrap percolation process only spreads polynomially fast. As random graph, we consider Erdős–Rényi graphs. For r=1, the perturbed percolation process percolates exponentially fast, as the diameter is known to be low. For r>=2 however, we prove that the process initially spreads polynomially fast, with no activations via the random graph, and eventually the activation rate becomes exponential, driven only by the random graph. We complement our theoretical results by empirical analyses of this process. We observe the predicted behavior not only for the graphs theoretically considered, but also additional random graph models.
07.04. 10:15virtualMichael Vaichenker
TBA