|| Enumerating Unique Column Combinations with a Sample-and-(Don’t)-Restart Strategy
A unique column combination (UCC) is a small fingerprint of a relational database. Enumerating all inclusion-wise minimal UCCs is therefore a reoccurring task in data profiling. In this talk we explore the close connection between UCCs and the hitting set problem. We show how this connection leads to new enumeration algorithms. In particular, we present a technique that combines a sampling strategy for databases with an input-guided search heuristic to find hitting sets. A preliminary experimental evaluation indicates that our approach is not only competitive with the current state-of-the-art UCC algorithm, but can process databases that were previously out of reach for data profiling.
The talk presents ongoing research with Johann Birnick, Thomas Bläsius, Tobias Friedrich, Julius Lischeid, Felix Naumann, Kitty Meeks, and Thorsten Papenbrock.
|| Controlling Assortativitiy in Heterogeneous Geometric Random Graphs
Hyperbolic random graphs have become a popular random graph model over the last decade, as they resemble many real-world graphs.
In particular, they have a heterogeneous degree distribution, high clustering and a small-world property.
Another property of graphs that distinguishes technical and biological from social networks is degree assortativity – a measure
that describes the correlation between degrees of adjacent nodes.
Most technical and biological graphs are found to have a significant negative assortativity, while the assortativity of social
networks is usually positive. However, hyperbolic random graphs offer no way to control the degree assortativity.
We explore and compare multiple ways to extend hyperbolic random graphs or the similar geometric inhomogeneous random graphs
so that the expected assortativity can be controlled while maintaining the properties that made hyperbolic random graphs attractive
in the first place. In particular, we describe a model with controllable assortativity that has high clustering, small-world and a
heterogeneous degree distribution. We also describe how such graphs can be generated efficiently.
||Master Project Summer 2019
|| From Symmetry to Asymmetry: Generalizing TSP Approximations by Parametrization
Asymmetric instances for the travelling salesman problem are practically relevant,
but still there is a big gap in approximation guarantees between symmetric and
asymmetric instances (assuming triangle inequality in either case).
We explore this apparent difference between symmetry and asymmetry by generalizing
the tree doubling and Christofides algorithm, the two most common approximations for TSP,
to parameterized approximations for ATSP.
The parameters we consider for the respective parameterizations are upper bounded by the
number of asymmetric distances in the given instance, which yields algorithms to
efficiently compute constant factor approximations also for moderately asymmetric TSP
As generalization of the Christofides algorithm, we derive a parameterized
2.5-approximation, where the parameter is the size of a vertex cover for the subgraph
induced by the asymmetric edges.
Our generalization of the tree doubling algorithm gives a parameterized 3-approximation,
where the parameter is the number of asymmetric edges in a given minimum spanning arborescence.
Both algorithms are also stated in the form of additive lossy kernelizations, which allows
to combine them with known polynomial time approximations for ATSP.
Further, we combine them with a notion of symmetry relaxation which allows to trade
approximation guarantee for runtime.
|| Robots, 3D Scene Understanding, and the Knowledge Graph: An Internship at... Google?
Earlier this year I (somewhat suddenly) realized that, maybe, I can't
be a PhD student forever. What should I do afterwards? The two most
prominent options are: staying in academia or going into industry.
Under the impression that I know what working at university is like, I
wanted to learn more about the second option. Therefore, I did an
internship as a software engineer at a large tech company.
In this lightweight presentation I talk about how I got in, as well as
the project that I worked on. This includes robots, how to answer
questions about indoor scenes programmatically, and how that is
related to Google's Knowledge Graph. Along the way I will share what
I learned about working in industry.
|| Convergence and Hardness of Strategic Schelling Segregation
The phenomenon of residential segregation was captured by Schelling's
famous segregation model where two types of agents are placed on a grid
and an agent is content with her location if the fraction of her neighbors
which have the same type as her is at least \(\tau\), for some
0 < \(\tau\) < 1 . Discontent agents simply swap their location with a
randomly chosen other discontent agent or jump to a random empty cell.
We analyze a generalized game-theoretic model of Schelling segregation
which allows more than two agent types and more general underlying graphs
modeling the residential area. For this we show that both aspects heavily
influence the dynamic properties and the tractability of finding an optimal
placement. We map the boundary of when improving response dynamics (IRD)
are guaranteed to converge and we prove several sharp threshold results
where guaranteed IRD convergence suddenly turns into a strong
non-convergence result: a violation of weak acyclicity. In particular, we
show threshold results also for Schelling's original model, which is in
contrast to the standard assumption in many empirical papers. In case of
convergence we show that IRD find equilibria quickly.
|| Graph and String Parameters: Connections Between Pathwidth, Cutwidth and the Locality Number
We investigate the locality number, a recently introduced structural parameter for strings,
and its connection to two important graph-parameters, cutwidth and pathwidth.
These connections allow to show that computing the locality number is NP-hard but fixed parameter
tractable (when the locality number or the alphabet size is treated as a parameter),
and can be approximated with a logarithmic ratio.
As a by-product, we also relate cutwidth via the locality number to pathwidth, which is of
independent interest, since it improves the currently best known approximation algorithm for cutwidth.
In this talk, I will tell you the tale of how this collection of essentially four simple polynomial-time
reductions got accepted to ICALP.
|| L Vertex Cover and L-U Subgraph Cover
Primarily, we consider a generalization of the famous Vertex Cover Problem.
Given are an edge lengths connected graph and a parameter L.
The task is to find a minimum set of vertices R, such that every connected subgraph with a total
length of at least L has a vertex from R.
The L Vertex Cover Problem was motivated to generate length restricted subgraphs for toll enforcement,
which is also a part of the seminar.
Summarized, we will see different solution methods for both problems. Especially for linear programming
approaches, we introduce general optimization methods, which are also applicable for different
Flow-Based Network Creation Games
While a lot of research considers the design of physical or logical networks as a centralized problem,
most real-world networks develop by decentralized, uncoordinated and often market-based processes
that are driven by independent, selfish agents. In 2003, Fabrikant et al. introduced their “Network Creation Game”,
which models the creation of Internet-like networks by selfish agents without a central coordination authority:
Node agents pay a fixed price for each edge they build towards other agents to reduce the cumulative hop-distance to the other agents in the resulting graph.
Ever since, a lot of research on similar models has been conducted, establishing a whole class of “Network Creation Games”.
To the best of our knowledge, all of them have in common, that the central metric for network quality is based on the distance to other agents,
which implies an optimized latency, which is an adequate approach for a lot of network use-cases.
Nevertheless, with the rising demand for cloud storage, high quality video streaming
and also in physical networks like the power grid, bandwidth/throughput is a crucial metric.
Based on the maximum flow problem – a concept first introduced by Harris & Ross as well as Dantzig in 1951 to measure the capacity of railway networks –
my master’s thesis presents a novel kind of Network Creation Game that regards the throughput between pairs of agents in a graph as the crucial metric for network quality.
The Current State of Schelling
Residential segregation is a wide-spread phenomenon that can be observed in almost every major city.
Schelling's famous agent-based model for residential segregation explains how such clusters can form even if all agents are tolerant,
i.e., if they agree to live in mixed neighborhoods.
Schelling is an exciting, ongoing topic in the game theoretic community and this year two working groups submitted some papers back and forth.
In this talk I would like to give an overview about this line of research with focus on our last paper "Topological Influence and Locality in Swap Schelling Games"
which put us into the lead. So the current paper score is 3 to 2 in our favor.