|| Enumerating Unique Column Combinations with a Sample-and-(Don’t)-Restart Strategy
A unique column combination (UCC) is a small fingerprint of a relational database. Enumerating all inclusion-wise minimal UCCs is therefore a reoccurring task in data profiling. In this talk we explore the close connection between UCCs and the hitting set problem. We show how this connection leads to new enumeration algorithms. In particular, we present a technique that combines a sampling strategy for databases with an input-guided search heuristic to find hitting sets. A preliminary experimental evaluation indicates that our approach is not only competitive with the current state-of-the-art UCC algorithm, but can process databases that were previously out of reach for data profiling.
The talk presents ongoing research with Johann Birnick, Thomas Bläsius, Tobias Friedrich, Julius Lischeid, Felix Naumann, Kitty Meeks, and Thorsten Papenbrock.
|| Controlling Assortativitiy in Heterogeneous Geometric Random Graphs
Hyperbolic random graphs have become a popular random graph model over the last decade, as they resemble many real-world graphs.
In particular, they have a heterogeneous degree distribution, high clustering and a small-world property.
Another property of graphs that distinguishes technical and biological from social networks is degree assortativity – a measure
that describes the correlation between degrees of adjacent nodes.
Most technical and biological graphs are found to have a significant negative assortativity, while the assortativity of social
networks is usually positive. However, hyperbolic random graphs offer no way to control the degree assortativity.
We explore and compare multiple ways to extend hyperbolic random graphs or the similar geometric inhomogeneous random graphs
so that the expected assortativity can be controlled while maintaining the properties that made hyperbolic random graphs attractive
in the first place. In particular, we describe a model with controllable assortativity that has high clustering, small-world and a
heterogeneous degree distribution. We also describe how such graphs can be generated efficiently.
||Master Project Summer 2019
|| From Symmetry to Asymmetry: Generalizing TSP Approximations by Parametrization
Asymmetric instances for the travelling salesman problem are practically relevant,
but still there is a big gap in approximation guarantees between symmetric and
asymmetric instances (assuming triangle inequality in either case).
We explore this apparent difference between symmetry and asymmetry by generalizing
the tree doubling and Christofides algorithm, the two most common approximations for TSP,
to parameterized approximations for ATSP.
The parameters we consider for the respective parameterizations are upper bounded by the
number of asymmetric distances in the given instance, which yields algorithms to
efficiently compute constant factor approximations also for moderately asymmetric TSP
As generalization of the Christofides algorithm, we derive a parameterized
2.5-approximation, where the parameter is the size of a vertex cover for the subgraph
induced by the asymmetric edges.
Our generalization of the tree doubling algorithm gives a parameterized 3-approximation,
where the parameter is the number of asymmetric edges in a given minimum spanning arborescence.
Both algorithms are also stated in the form of additive lossy kernelizations, which allows
to combine them with known polynomial time approximations for ATSP.
Further, we combine them with a notion of symmetry relaxation which allows to trade
approximation guarantee for runtime.
|| Robots, 3D Scene Understanding, and the Knowledge Graph: An Internship at... Google?
Earlier this year I (somewhat suddenly) realized that, maybe, I can't
be a PhD student forever. What should I do afterwards? The two most
prominent options are: staying in academia or going into industry.
Under the impression that I know what working at university is like, I
wanted to learn more about the second option. Therefore, I did an
internship as a software engineer at a large tech company.
In this lightweight presentation I talk about how I got in, as well as
the project that I worked on. This includes robots, how to answer
questions about indoor scenes programmatically, and how that is
related to Google's Knowledge Graph. Along the way I will share what
I learned about working in industry.
|| Convergence and Hardness of Strategic Schelling Segregation
The phenomenon of residential segregation was captured by Schelling's
famous segregation model where two types of agents are placed on a grid
and an agent is content with her location if the fraction of her neighbors
which have the same type as her is at least \(\tau\), for some
0 < \(\tau\) < 1 . Discontent agents simply swap their location with a
randomly chosen other discontent agent or jump to a random empty cell.
We analyze a generalized game-theoretic model of Schelling segregation
which allows more than two agent types and more general underlying graphs
modeling the residential area. For this we show that both aspects heavily
influence the dynamic properties and the tractability of finding an optimal
placement. We map the boundary of when improving response dynamics (IRD)
are guaranteed to converge and we prove several sharp threshold results
where guaranteed IRD convergence suddenly turns into a strong
non-convergence result: a violation of weak acyclicity. In particular, we
show threshold results also for Schelling's original model, which is in
contrast to the standard assumption in many empirical papers. In case of
convergence we show that IRD find equilibria quickly.
|| Graph and String Parameters: Connections Between Pathwidth, Cutwidth and the Locality Number
We investigate the locality number, a recently introduced structural parameter for strings,
and its connection to two important graph-parameters, cutwidth and pathwidth.
These connections allow to show that computing the locality number is NP-hard but fixed parameter
tractable (when the locality number or the alphabet size is treated as a parameter),
and can be approximated with a logarithmic ratio.
As a by-product, we also relate cutwidth via the locality number to pathwidth, which is of
independent interest, since it improves the currently best known approximation algorithm for cutwidth.
In this talk, I will tell you the tale of how this collection of essentially four simple polynomial-time
reductions got accepted to ICALP.
|| L Vertex Cover and L-U Subgraph Cover
Primarily, we consider a generalization of the famous Vertex Cover Problem.
Given are an edge lengths connected graph and a parameter L.
The task is to find a minimum set of vertices R, such that every connected subgraph with a total
length of at least L has a vertex from R.
The L Vertex Cover Problem was motivated to generate length restricted subgraphs for toll enforcement,
which is also a part of the seminar.
Summarized, we will see different solution methods for both problems. Especially for linear programming
approaches, we introduce general optimization methods, which are also applicable for different
Flow-Based Network Creation Games
While a lot of research considers the design of physical or logical networks as a centralized problem,
most real-world networks develop by decentralized, uncoordinated and often market-based processes
that are driven by independent, selfish agents. In 2003, Fabrikant et al. introduced their “Network Creation Game”,
which models the creation of Internet-like networks by selfish agents without a central coordination authority:
Node agents pay a fixed price for each edge they build towards other agents to reduce the cumulative hop-distance to the other agents in the resulting graph.
Ever since, a lot of research on similar models has been conducted, establishing a whole class of “Network Creation Games”.
To the best of our knowledge, all of them have in common, that the central metric for network quality is based on the distance to other agents,
which implies an optimized latency, which is an adequate approach for a lot of network use-cases.
Nevertheless, with the rising demand for cloud storage, high quality video streaming
and also in physical networks like the power grid, bandwidth/throughput is a crucial metric.
Based on the maximum flow problem – a concept first introduced by Harris & Ross as well as Dantzig in 1951 to measure the capacity of railway networks –
my master’s thesis presents a novel kind of Network Creation Game that regards the throughput between pairs of agents in a graph as the crucial metric for network quality.
The Current State of Schelling
Residential segregation is a wide-spread phenomenon that can be observed in almost every major city.
Schelling's famous agent-based model for residential segregation explains how such clusters can form even if all agents are tolerant,
i.e., if they agree to live in mixed neighborhoods.
Schelling is an exciting, ongoing topic in the game theoretic community and this year two working groups submitted some papers back and forth.
In this talk I would like to give an overview about this line of research with focus on our last paper "Topological Influence and Locality in Swap Schelling Games"
which put us into the lead. So the current paper score is 3 to 2 in our favor.
|| Optimal Kidney Exchange with Immunosuppressants
The deployment of centralized matching algorithms for efficient exchange of donated kidneys is a major success story of market design. In the standard kidney exchange problem, blood- or tissue-type incompatibility between a patient and a donor makes a transplant impossible. However, novel technological advances on potent immunosuppressant drugs can lift this barrier.
We present a general computational framework to study kidney exchange with immunosuppressants. In contrast to the standard kidney exchange problem, our problem also involves the decision about which patients get immunosuppressants to make them compatible with originally incompatible kidneys. Our main contribution is a set of general algorithms that provide flexibility in terms satisfying meaningful purposes.
Joint work with Haris Aziz.
|| Connected partition of graphs
I will talk about connected partitioning of graphs, where we are given a graph and want to partition the vertices into
k such that each part induces a connected subgraph and there are also some additional constraints on the
size/weight of each part. This problem has applications in the fields of road transport networks, multi-robot
coordination, power grid networks, image-processing, clustering etc. We look at a special case, when we know
that the input graph is k-connected. In this setting, we will look at a classical theorem from the 70's
called the Gyori-Lovasz theorem and a recent vertex-weighted generalization of it. I will give an outline
of a proof that we recently (in ICALP '18) gave for the vertex-weighted generalization. The Gyori-Lovasz
theorem and its generalization only give existential results (and also exponential algorithms) for connected
k-partition of k-connected graphs. It is an interesting open question to find efficient algorithms for the
same. I will speak about this and also some other related open problems.
|| Counting, Modular Counting and Symmetries
The notion of a homomorphism between graphs as a map between the associated vertices that
preserves edges is not only simple in its definition but also very powerful in its generality.
Many important problems and their counting-version can be encoded as homomorphisms to specific
target graphs, e.g. vertex-colouring and independent-set. Hence, it is less surprising that the
study of graph homomorphisms is a very active research topic both enjoying new results and
rediscoveries of old results providing new techniques and insights.
In this talk, we will discuss an instance of the latter in the form of a recent paper by Chen,
Curticapean and Dell from 2019. They study the problem of determining the number of homomorphisms
to a given target graph and rediscover the same dichotomy as Dyer and Greenhill in 2000.
Their approach to generalize the problem even more allows them to avoid a big case distinction and
provides additional insights into the problem class. After the discussion of their approach we will
transition to the problem of modular counting homomorphisms and to the question whether the approach
of Chen et al. can be applied. In particular, the modular version introduces an additional phenomenon
as symmetric substructures might get cancelled leading to a loss of structural properties. We will conclude
the talk by a short peak into the study of symmetries, an even older study full of history and big names.
|| Cautious Limit Learning
In the framework of language learning in the limit from text, a learner (a computable function)
is successively presented all and only the information of a target language (a computably enumerable
subset of the natural numbers). With every new datum, the learner makes a new guess which set it
believes to be presented. Learning is successful once the learner sticks to a single, correct
description of the target language (explanatory learning). Cautious learners additionally may
never output a hypothesis which is a proper subset of a previous guess.
Although being a seemingly natural way of learning, being cautious is known to severely lessen
the learning capabilities of explanatory (syntactic) learners. To further understand this loss
of learning power, previous work introduced weakened versions of cautious learning and gave first
partial results on their relation. In this talk, we provide the audience with a feeling for language
learning in the limit from text and discuss the completed picture for cautious learning restrictions.
||Stefan Neubert, Martin Schirneck