Friedrich, Tobias; Kötzing, Timo; Radhakrishnan, Aishwarya; Schiller, Leon; Schirneck, Martin; Tennigkeit, Georg; Wietheger, Simon Crossover for Cardinality Constrained Optimization. ACM Transactions on Evolutionary Learning and Optimization 2023: 1–32
To understand better how and why crossover can benefit constrained optimization, we consider pseudo-Boolean functions with an upper bound \(B\) on the number of 1-bits allowed in the length-\(n\) bit string (i.e., a cardinality constraint). We investigate the natural translation of the OneMax test function to this setting, a linear function where \(B\) bits have a weight of \(1+ 1/n\) and the remaining bits have a weight of \(1\). Friedrich et al. [TCS 2020] gave a bound of \(\Theta(n^2)\) for the expected running time of the (1+1) EA on this function. Part of the difficulty when optimizing this problem lies in having to improve individuals meeting the cardinality constraint by flipping a \(1\) and a \(0\) simultaneously. The experimental literature proposes balanced operators, preserving the number of 1-bits, as a remedy. We show that a balanced mutation operator optimizes the problem in \(O(n \log n)\) if \(n-B = O(1)\). However, if \(n-B = \Theta(n)\), we show a bound of \(\Omega(n^2)\), just as for classic bit mutation. Crossover together with a simple island model gives running times of \(O(n^2 / \log n)\) (uniform crossover) and \(O(n\sqrt{n})\) (3-ary majority vote crossover). For balanced uniform crossover with Hamming-distance maximization for diversity we show a bound of \(O(n \log n)\). As an additional contribution, we present an extensive analysis of different balanced crossover operators from the literature.
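In the notation of the abstract, the constrained benchmark can be written as a weighted OneMax variant; placing the \(B\) heavy bits at the first positions is our own choice for illustration:

\[
\max\; f(x) \;=\; \Bigl(1+\tfrac{1}{n}\Bigr)\sum_{i=1}^{B} x_i \;+\; \sum_{i=B+1}^{n} x_i
\quad\text{subject to}\quad \sum_{i=1}^{n} x_i \le B,\qquad x \in \{0,1\}^n.
\]

The optimum sets exactly the \(B\) heavy bits to 1. Once an individual is tight at the constraint, any improvement must flip a light 1-bit and a heavy 0-bit in the same step, which is precisely the difficulty mentioned above that balanced (swap-type) mutation and crossover are meant to alleviate.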
Friedrich, Tobias; Gawendowicz, Hans; Lenzner, Pascal; Melnichenko, Anna Social Distancing Network Creation. Algorithmica 2023
During a pandemic people have to find a trade-off between meeting others and staying safely at home. While meeting others is pleasant, it also increases the risk of infection. We consider this dilemma by introducing a game-theoretic network creation model in which selfish agents can form bilateral connections. They benefit from network neighbors, but at the same time, they want to maximize their distance to all other agents. This models the inherent conflict that social distancing rules impose on the behavior of selfish agents in a social network. Besides addressing this familiar issue, our model can be seen as the inverse to the well-studied Network Creation Game by Fabrikant et al. (in: PODC 2003, pp 347–351, 2003. https://doi.org/10.1145/872035.872088), where agents aim at being as central as possible in the created network. We look at two variants of network creation governed by social distancing. Firstly, a variant without connection restrictions, where we characterize optimal and equilibrium networks, and derive asymptotically tight bounds on the Price of Anarchy and Price of Stability. The second variant allows connection restrictions. As our main result, we prove that Swap-Maximal Routing-Cost Spanning Trees, an efficiently computable weaker variant of Maximum Routing-Cost Spanning Trees, actually resemble equilibria for a significant range of the parameter space. Moreover, we give almost tight bounds on the Price of Anarchy and Price of Stability. These results imply that under social distancing the agents’ selfishness has a strong impact on the quality of the equilibria.
Bläsius, Thomas; Friedrich, Tobias; Krejca, Martin S.; Molitor, Louise The Impact of Geometry on Monochrome Regions in the Flip Schelling Process. Computational Geometry (CGTA) 2023: 101902
Schelling's classical segregation model gives a coherent explanation for the widespread phenomenon of residential segregation. We introduce an agent-based saturated open-city variant, the Flip Schelling Process (FSP), in which agents, placed on a graph, have one out of two types and, based on the predominant type in their neighborhood, decide whether to change their type, akin to a new agent arriving as soon as another agent leaves the vertex. We investigate the probability that an edge \(\{u,v\}\) is monochrome, i.e., that both vertices \(u\) and \(v\) have the same type in the FSP, and we provide a general framework for analyzing the influence of the underlying graph topology on residential segregation. In particular, for two adjacent vertices, we show that a highly decisive common neighborhood, i.e., a common neighborhood in which the numbers of vertices of the two types differ substantially, supports segregation and, moreover, that large common neighborhoods are more decisive. As an application, we study the expected behavior of the FSP on two common random graph models with and without geometry: (1) For random geometric graphs, we show that the existence of an edge \(\{u,v\}\) makes a highly decisive common neighborhood for \(u\) and \(v\) more likely. Based on this, we prove the existence of a constant \(c>0\) such that the expected fraction of monochrome edges after the FSP is at least \(1/2+c\). (2) For Erdős–Rényi graphs, we show that large common neighborhoods are unlikely and that the expected fraction of monochrome edges after the FSP is at most \(1/2+o(1)\). Our results indicate that the cluster structure of the underlying graph has a significant impact on the obtained segregation strength.
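Purely as an illustration of the kind of process analyzed here: a minimal sketch of one synchronous round in which every vertex adopts the predominant type of its neighborhood, together with the monochrome-edge fraction studied above. The synchronous update, the tie-breaking convention, and all function names are simplifying assumptions of this sketch, not taken from the paper.

```python
import random
import networkx as nx

def flip_round(G: nx.Graph, types: dict) -> dict:
    """One synchronous round: each vertex adopts the predominant type
    among its neighbors; on a tie it keeps its current type."""
    new_types = {}
    for v in G.nodes:
        counts = {0: 0, 1: 0}
        for u in G.neighbors(v):
            counts[types[u]] += 1
        if counts[0] != counts[1]:
            new_types[v] = 0 if counts[0] > counts[1] else 1
        else:
            new_types[v] = types[v]
    return new_types

def monochrome_fraction(G: nx.Graph, types: dict) -> float:
    """Fraction of edges whose two endpoints have the same type."""
    mono = sum(1 for u, v in G.edges if types[u] == types[v])
    return mono / G.number_of_edges()

# toy usage: random initial types on an Erdos-Renyi graph
G = nx.erdos_renyi_graph(200, 0.05, seed=1)
types = {v: random.randint(0, 1) for v in G.nodes}
print(monochrome_fraction(G, flip_round(G, types)))
```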
Böther, Maximilian; Schiller, Leon; Fischbeck, Philipp; Molitor, Louise; Krejca, Martin S.; Friedrich, Tobias Evolutionary Minimization of Traffic Congestion. IEEE Transactions on Evolutionary Computation 2023: 1809–1821
Traffic congestion is a major issue that can be solved by suggesting drivers alternative routes they are willing to take. This concept has been formalized as a strategic routing problem in which a single alternative route is suggested to an existing one. We extend this formalization and introduce the Multiple-Routes problem, which, given a start and a destination, aims at finding up to \(n\) different routes over which the drivers strategically disperse, minimizing the overall travel time of the system. Due to the NP-hard nature of the problem, we introduce the Multiple-Routes evolutionary algorithm (MREA) as a heuristic solver. We study several mutation and crossover operators and evaluate them on real-world data of Berlin, Germany. We find that a combination of all operators yields the best result, reducing the overall travel time by a factor between \(1.8\) and \(3\), in the median, compared to all drivers taking the fastest route. For the base case \(n = 2\), we compare our MREA to the highly tailored optimal solver by Bläsius et al. (ATMOS 2020), and show that, in the median, our approach finds solutions of quality at least \(99.69 \%\) of an optimal solution while only requiring \(40 \%\) of the time.
Behrendt, Lukas; Casel, Katrin; Friedrich, Tobias; Lagodzinski, J. A. Gregor; Löser, Alexander; Wilhelm, Marcus From symmetry to asymmetry: Generalizing TSP approximations by parametrization. Journal of Computer and System Sciences 2023: 157–170
We generalize the tree doubling and Christofides algorithms to parameterized approximations for ATSP (constant-factor approximations that invest more runtime with respect to a chosen parameter). The parameters we consider are upper bounded by the number of asymmetric distances, which yields algorithms to efficiently compute good approximations for moderately asymmetric TSP instances. As a generalization of the Christofides algorithm, we derive a parameterized 2.5-approximation, with the size of a vertex cover for the subgraph induced by the edges with asymmetric distances as parameter. Our generalization of tree doubling gives a parameterized 3-approximation, where the parameter is the minimum number of asymmetric distances in a minimum spanning arborescence. Further, we combine these with a notion of symmetry relaxation which allows trading the approximation guarantee for runtime. Since the parameters we consider are theoretically incomparable, we present experimental results showing that generalized tree doubling frequently outperforms generalized Christofides with respect to parameter size.
Krämer, Bastian; Stang, Moritz; Doskoč, Vanja; Schäfers, Wolfgang; Friedrich, Tobias Automated valuation models: improving model performance by choosing the optimal spatial training level. Journal of Property Research 2023: 365–390
The academic community has discussed using Automated Valuation Models (AVMs) in the context of traditional real estate valuations and their performance for several decades. Most studies focus on finding the best method for estimating property values. One aspect that has not yet been studied scientifically is the appropriate choice of the spatial training level. The published research on AVMs usually deals with a manually defined region and fails to test the methods used on different spatial levels. Our research aims to investigate the impact of training AVM algorithms at different spatial levels in terms of valuation accuracy. We use a dataset with 1.2 million residential properties from Germany and test four methods: Ordinary Least Square, Generalised Additive Models, eXtreme Gradient Boosting and Deep Neural Network. Our results show that the right choice of spatial training level can significantly impact the model performance, and that this impact varies across the different methods.
Cohen, Sarel; Hershcovitch, Moshik; Taraz, Martin; Kißig, Otto; Issac, Davis; Wood, Andrew; Waddington, Daniel; Chin, Peter; Friedrich, Tobias Improved And Optimized Drug Repurposing For The SARS-CoV-2 Pandemic. PLOS ONE 2023
The active global SARS-CoV-2 pandemic has caused more than 426 million cases and 5.8 million deaths worldwide. The development of completely new drugs for such a novel disease is a challenging, time-intensive process. Despite researchers around the world working on this task, no effective treatments have been developed yet. This emphasizes the importance of drug repurposing, where treatments are found among existing drugs that are meant for different diseases. A common approach to this is based on knowledge graphs, which condense relationships between entities like drugs, diseases and genes. Graph neural networks (GNNs) can then be used for the task at hand by predicting links in such knowledge graphs. Expanding on state-of-the-art GNN research, Doshi et al. recently developed the Dr-COVID model. We further extend their work using additional output interpretation strategies. The best aggregation strategy derives a top-100 ranking of 8,070 candidate drugs, 32 of which are currently being tested in COVID-19-related clinical trials. Moreover, we present an alternative application for the model, the generation of additional candidates based on a given pre-selection of drug candidates using collaborative filtering. In addition, we improve the implementation of the Dr-COVID model by significantly shortening the inference and pre-processing time by exploiting data parallelism. As drug repurposing is a task that requires high computation and memory resources, we further accelerate the post-processing phase using newly emerging hardware: we propose a new approach to leverage the use of high-capacity Non-Volatile Memory for aggregate drug ranking.
Friedrich, Tobias; Göbel, Andreas; Krejca, Martin S.; Pappik, Marcus Polymer Dynamics via Cliques: New Conditions for Approximations. Theoretical Computer Science 2023: 230–252
Abstract polymer models are systems of weighted objects, called polymers, equipped with an incompatibility relation. An important quantity associated with such models is the partition function, which is the weighted sum over all sets of compatible polymers. Various approximation problems reduce to approximating the partition function of a polymer model. Central to the existence of such approximation algorithms are weight conditions of the respective polymer model. Such conditions are derived either via complex analysis or via probabilistic arguments. We follow the latter path and establish a new condition, the clique dynamics condition, which is less restrictive than the ones in the literature. We introduce a new Markov chain where the clique dynamics condition implies rapid mixing by utilizing cliques of incompatible polymers that naturally arise from the translation of algorithmic problems into polymer models. This leads to improved parameter ranges for several approximation algorithms, for example by a factor of at least \(2^{1/\alpha}\) for the hard-core model on bipartite \(\alpha\)-expanders.
Bläsius, Thomas; Fischbeck, Philipp; Friedrich, Tobias; Katzmann, Maximilian Solving Vertex Cover in Polynomial Time on Hyperbolic Random Graphs. Theory of Computing Systems 2023: 28–51
The computational complexity of the VertexCover problem has been studied extensively. Most notably, it is NP-complete to find an optimal solution and typically NP-hard to find an approximation with reasonable factors. In contrast, recent experiments suggest that on many real-world networks the run time to solve VertexCover is much smaller than even the best known FPT approaches can explain. We link these observations to two properties that are observed in many real-world networks, namely a heterogeneous degree distribution and high clustering. To formalize these properties and explain the observed behavior, we analyze how a branch-and-reduce algorithm performs on hyperbolic random graphs, which have become increasingly popular for modeling real-world networks. In fact, we are able to show that the VertexCover problem on hyperbolic random graphs can be solved in polynomial time, with high probability. The proof relies on interesting structural properties of hyperbolic random graphs. Since these predictions of the model are interesting in their own right, we conducted experiments on real-world networks showing that these properties are also observed in practice.
Friedrich, Tobias; Lenzner, Pascal; Molitor, Louise; Seifert, Lars Single-Peaked Jump Schelling Games. International Symposium on Algorithmic Game Theory (SAGT) 2023
Schelling games model the widespread phenomenon of residential segregation in metropolitan areas from a game-theoretic point of view. In these games, agents of different types each strategically select a node on a given graph that models the residential area, in order to maximize their individual utility. The latter solely depends on the types of the agents on neighboring nodes, and it has been a standard assumption to consider utility functions that are monotone in the number of same-type neighbors. This simplifying assumption has recently been challenged, since sociological poll results suggest that real-world agents actually favor diverse neighborhoods. We contribute to the recent endeavor of investigating residential segregation models with realistic agent behavior by studying Jump Schelling Games with agents having a single-peaked utility function. In such games, there are empty nodes in the graph and agents can strategically jump to such nodes to improve their utility. We investigate the existence of equilibria and show that they exist under specific conditions. In contrast, we prove that even on simple topologies like paths or rings such stable states are not guaranteed to exist. Regarding the game dynamics, we show that improving response cycles exist independently of the position of the peak in the utility function. Moreover, we show high and almost tight bounds on the Price of Anarchy and the Price of Stability with respect to the recently proposed degree of integration, which counts the number of agents with a diverse neighborhood and serves as a proxy for measuring the segregation strength. Last but not least, we show that computing a beneficial state with high integration is NP-complete and, as a novel conceptual contribution, we also show that it is NP-hard to decide whether an equilibrium state can be reached via improving response dynamics starting from a given initial state.
Quinzan, Francesco; Khanna, Rajiv; Hershcovitch, Moshik; Cohen, Sarel; Waddington, Daniel G.; Friedrich, Tobias; Mahoney, Michael W. Fast Feature Selection with Fairness Constraints. Artificial Intelligence and Statistics (AISTATS) 2023: 7800–7823
We study the fundamental problem of selecting optimal features for model construction. This problem is computationally challenging on large datasets, even with the use of greedy algorithm variants. To address this challenge, we extend the adaptive query model, recently proposed for the greedy forward selection for submodular functions, to the faster paradigm of Orthogonal Matching Pursuit for non-submodular functions. The proposed algorithm achieves exponentially fast parallel run time in the adaptive query model, scaling much better than prior work. Furthermore, our extension allows the use of downward-closed constraints, which can be used to encode certain fairness criteria into the feature selection process. We prove strong approximation guarantees for the algorithm based on standard assumptions. These guarantees are applicable to many parametric models, including Generalized Linear Models. Finally, we demonstrate empirically that the proposed algorithm competes favorably with state-of-the-art techniques for feature selection, on real-world and synthetic datasets.
Friedrich, Tobias; Lenzner, Pascal; Molitor, Louise; Seifert, Lars Single-Peaked Jump Schelling Games. Autonomous Agents and Multiagent Systems (AAMAS) 2023: 2899–2901
Schelling games model the widespread phenomenon of residential segregation in metropolitan areas from a game-theoretic point of view. In these games, agents of different types each strategically select a node on a given graph that models the residential area, in order to maximize their individual utility. The latter solely depends on the types of the agents on neighboring nodes, and it has been a standard assumption to consider utility functions that are monotone in the number of same-type neighbors. This simplifying assumption has recently been challenged, since sociological poll results suggest that real-world agents actually favor diverse neighborhoods. We contribute to the recent endeavor of investigating residential segregation models with realistic agent behavior by studying Jump Schelling Games with agents having a single-peaked utility function. In such games, there are empty nodes in the graph and agents can strategically jump to such nodes to improve their utility. We investigate the existence of equilibria and show that they exist under specific conditions. In contrast, we prove that even on simple topologies like paths or rings such stable states are not guaranteed to exist. Regarding the game dynamics, we show that improving response cycles exist independently of the position of the peak in the utility function. Moreover, we show high and almost tight bounds on the Price of Anarchy and the Price of Stability with respect to the recently proposed degree of integration, which counts the number of agents with a diverse neighborhood and serves as a proxy for measuring the segregation strength. Last but not least, we show that computing a beneficial state with high integration is NP-complete and, as a novel conceptual contribution, we also show that it is NP-hard to decide whether an equilibrium state can be reached via improving response dynamics starting from a given initial state.
Khomutovskiy, Ivan; Dunker, Rebekka; Dierking, Jessica; Egbert, Julian; Helms, Christian; Schöllkopf, Finn; Casel, Katrin; Fischbeck, Philipp; Friedrich, Tobias; Issac, Davis; Krogmann, Simon; Lenzner, Pascal Applying Skeletons to Speed Up the Arc-Flags Routing Algorithm. SIAM Symposium on Algorithm Engineering and Experiments (ALENEX) 2023: 110–122
The Single-Source Shortest Path problem is classically solved by applying Dijkstra's algorithm. However, the plain version of this algorithm is far too slow for real-world applications such as routing in large road networks. To amend this, many speed-up techniques have been developed that build on the idea of computing auxiliary data in a preprocessing phase, which is then used to speed up the queries. One well-known example is the Arc-Flags algorithm, which is based on the idea of precomputing edge flags to make the search more goal-directed. To explain the strong practical performance of such speed-up techniques, several graph parameters have been introduced. The skeleton dimension is one such parameter that has already been used to derive runtime bounds for some speed-up techniques. Moreover, it was experimentally shown to be low in real-world road networks. We introduce a method to incorporate skeletons, the underlying structure behind the skeleton dimension, to improve routing speed-up techniques even further. As a proof of concept, we develop new algorithms called SKARF and SKARF+ that combine skeletons with Arc-Flags, and demonstrate via extensive experiments on large real-world road networks that SKARF+ yields a significant reduction of about 30% to 40% in search space and query time over Arc-Flags. We also prove theoretical bounds on the query time of SKARF, which is the first time an Arc-Flags variant has been analyzed in terms of skeleton dimension.
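For readers unfamiliar with Arc-Flags, the following sketch shows how precomputed edge flags prune a Dijkstra query. The partition of the vertices into regions, the `flags` table, and all names are illustrative assumptions; the skeleton-based improvements of SKARF/SKARF+ are not reflected here.

```python
import heapq

def arcflags_query(adj, flags, region_of, s, t):
    """Dijkstra from s to t that only relaxes edges whose precomputed flag
    for the target's region is set. `adj[u]` is a list of (v, weight) pairs,
    `flags[(u, v)]` is the set of regions for which edge (u, v) lies on some
    shortest path into that region. Illustrative sketch only."""
    target_region = region_of[t]
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == t:
            return d
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            if target_region not in flags[(u, v)]:
                continue  # pruned: edge is not flagged for the target's region
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return float("inf")
```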
Krämer, Bastian; Stang, Moritz; Doskoč, Vanja; Schäfers, Wolfgang; Friedrich, Tobias Automated Valuation Models: Improving Model Performance by Choosing the Optimal Spatial Training Level. American Real Estate Society (ARES) 2023: 1–26
The use of Automated Valuation Models (AVMs) in the context of traditional real estate valuations and their performance has been discussed in the academic community for several decades. Most studies focus on finding which method is best suited for estimating property values. One aspect that has not yet been studied scientifically is the appropriate choice of the spatial training level. The published research on AVMs usually deals with a manually defined region and fails to test the methods used on different spatial levels. The aim of our research is thus to investigate the impact of training AVM algorithms at different spatial levels in terms of valuation accuracy. We use a dataset with about 1.2 million residential properties from Germany and test four different methods, namely Ordinary Least Square, Generalized Additive Models, eXtreme Gradient Boosting and Deep Neural Network. Our results show that the right choice of spatial training level can have a major impact on the model performance, and that this impact varies across the different methods.
Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Ruff, Janosch; Zeif, Ziena On the Giant Component of Geometric Inhomogeneous Random Graphs. European Symposium on Algorithms (ESA) 2023: 20:1–20:13
In this paper we study the threshold model of geometric inhomogeneous random graphs (GIRGs), a generative random graph model that is closely related to hyperbolic random graphs (HRGs). These models have been observed to capture complex real-world networks well with respect to their structural and algorithmic properties. Following comprehensive studies regarding their connectivity, i.e., which parts of the graphs are connected, we have a good understanding of the circumstances under which a giant component (containing a constant fraction of the graph) emerges. While previous results are rather technical and challenging to work with, the goal of this paper is to provide more accessible proofs. At the same time we significantly improve the previously known probabilistic guarantees, showing that GIRGs contain a giant component with probability \(1 - \exp(-\Omega(n^{(3-\tau)/2}))\) for graph size \(n\) and a degree distribution with power-law exponent \(\tau \in (2, 3)\). Based on that, we additionally derive insights about the connectivity of certain induced subgraphs of GIRGs.
Casel, Katrin; Friedrich, Tobias; Schirneck, Martin; Wietheger, Simon Fair Correlation Clustering in Forests. Foundations of Responsible Computing (FORC) 2023: 9:1–9:12
The study of algorithmic fairness has received growing attention recently. This stems from the awareness that bias in the input data for machine learning systems may result in discriminatory outputs. For clustering tasks, one of the most central notions of fairness is the formalization by Chierichetti, Kumar, Lattanzi, and Vassilvitskii [NeurIPS 2017]. A clustering is said to be fair if each cluster has the same distribution of manifestations of a sensitive attribute as the whole input set. This is motivated by various applications where the objects to be clustered have sensitive attributes that should not be over- or underrepresented. Most research on this version of fair clustering has focused on centroid-based objectives. In contrast, we discuss the applicability of this fairness notion to Correlation Clustering. The existing literature on the resulting Fair Correlation Clustering problem either presents approximation algorithms with poor approximation guarantees or severely limits the possible distributions of the sensitive attribute (often only two manifestations with a 1:1 ratio are considered). Our goal is to understand if there is hope for better results in between these two extremes. To this end, we consider restricted graph classes which allow us to characterize the distributions of sensitive attributes for which this form of fairness is tractable from a complexity point of view. While existing work on Fair Correlation Clustering gives approximation algorithms, we focus on exact solutions and investigate whether there are efficiently solvable instances. The unfair version of Correlation Clustering is trivial on forests, but adding fairness creates a surprisingly rich picture of complexities. We give an overview of the distributions and types of forests where Fair Correlation Clustering turns from tractable to intractable. As the most surprising insight, the cause of the hardness of Fair Correlation Clustering is not the strictness of the fairness condition: we lift most of our results to also hold for the relaxed version of the fairness condition. Instead, the source of hardness seems to be the distribution of the sensitive attribute. On the positive side, we identify some reasonable distributions that are indeed tractable. While this tractability is only shown for forests, it may open an avenue to design reasonable approximations for larger graph classes.
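In symbols, the fairness notion of Chierichetti et al. used here requires every cluster to reproduce the global distribution of the sensitive attribute. Writing \(V_1, \dots, V_m \subseteq V\) for the classes of objects sharing a manifestation (notation chosen here for illustration), a clustering \(\mathcal{C}\) is fair if

\[
\frac{|C \cap V_i|}{|C|} \;=\; \frac{|V_i|}{|V|}
\qquad \text{for every cluster } C \in \mathcal{C} \text{ and every } i \in \{1,\dots,m\}.
\]

The relaxed variant mentioned at the end of the abstract commonly only requires these ratios to lie within prescribed bounds rather than to be exactly equal.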
Nikfarjam, Adel; Rothenberger, Ralf; Neumann, Frank; Friedrich, Tobias Evolutionary Diversity Optimisation in Constructing Satisfying Assignments. Genetic and Evolutionary Computation Conference (GECCO) 2023
Computing diverse solutions for a given problem, in particular evolutionary diversity optimisation (EDO), is a hot research topic in the evolutionary computation community. This paper studies the Boolean satisfiability problem (SAT) in the context of EDO. SAT is of great importance in computer science and differs from the other problems that have been studied in the EDO literature, such as KP and TSP, in two ways: 1) it is possible to add further constraints (clauses) to the problem so as to forbid solutions or fix variables; 2) there are several powerful solvers in the literature, such as MiniSat, and we utilise such a solver to construct a diverse set of solutions. Moreover, maximising diversity provides us with invaluable information about the solution space of a given SAT problem, such as how large the feasible region is. In this study, we introduce evolutionary algorithms (EAs) that employ a well-known SAT solver to explicitly maximise diversity among a set of SAT solutions. The experimental investigations indicate the introduced algorithms' capability to maximise diversity among the SAT solutions.
Baguley, Samuel; Friedrich, Tobias; Neumann, Aneta; Neumann, Frank; Pappik, Marcus; Zeif, Ziena Fixed Parameter Multi-Objective Evolutionary Algorithms for the W-Separator Problem. Genetic and Evolutionary Computation Conference (GECCO) 2023
Parameterized analysis provides powerful mechanisms for obtaining fine-grained insights into different types of algorithms. In this work, we combine this field with evolutionary algorithms and provide a parameterized complexity analysis of evolutionary multi-objective algorithms for the W-separator problem, which is a natural generalization of the vertex cover problem. The goal is to remove the minimum number of vertices such that each connected component in the resulting graph has at most W vertices. We provide different multi-objective formulations involving two or three objectives that provably lead to fixed-parameter evolutionary algorithms with respect to the value of an optimal solution OPT and W. Of particular interest are kernelizations and the reducible structures used for them. We show that, in expectation, the algorithms make incremental progress in finding such structures and beyond. The currently best known kernelization of the W-separator problem uses linear programming methods and requires a non-trivial post-processing step to extract the reducible structures. We provide additional structural features to show that evolutionary algorithms with appropriate objectives are also capable of extracting them. Our results show that evolutionary algorithms with different objectives guide the search and admit fixed-parameter running times to solve or approximate (even arbitrarily closely) the W-separator problem.
Friedrich, Tobias; Kötzing, Timo; Neumann, Aneta; Neumann, Frank; Radhakrishnan, Aishwarya Analysis of the (1+1) EA on LeadingOnes with Constraints. Genetic and Evolutionary Computation Conference (GECCO) 2023
Understanding how evolutionary algorithms perform on constrained problems has gained increasing attention in recent years. In this paper, we study how evolutionary algorithms optimize constrained versions of the classical LeadingOnes problem. We first provide a run time analysis for the classical (1+1) EA on the LeadingOnes problem with a deterministic cardinality constraint, giving \(\Theta(n (n-B)\log(B) + n^2)\) as the tight bound. Our results show that the behaviour of the algorithm is highly dependent on the bound \(B\) of the uniform constraint. Afterwards, we consider the problem in the context of stochastic constraints and provide insights, using experimental studies, on how the (\(\mu\)+1) EA is able to deal with these constraints in a sampling-based setting.
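For reference, the underlying benchmark counts the length of the prefix of 1-bits; the cardinality-constrained version studied here can be stated as follows (the formalization of LeadingOnes is standard, while the exact constraint handling of the paper is not spelled out in the abstract):

\[
\max\; \mathrm{LeadingOnes}(x) \;=\; \sum_{i=1}^{n} \prod_{j=1}^{i} x_j
\quad\text{subject to}\quad \sum_{i=1}^{n} x_i \le B, \qquad x \in \{0,1\}^n.
\]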
Bilò, Davide; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin Fault-Tolerant ST-Diameter Oracles. International Colloquium on Automata, Languages and Programming (ICALP) 2023: 24:1–24:20
We study the problem of estimating the \(ST\)-diameter of a graph that is subject to a bounded number of edge failures. An \(f\)-edge fault-tolerant \(ST\)-diameter oracle (\(f\)-FDO-\(ST\)) is a data structure that preprocesses a given graph \(G\), two sets of vertices \(S,T\), and a positive integer \(f\). When queried with a set \(F\) of at most \(f\) edges, the oracle returns an estimate \(\widehat{D}\) of the \(ST\)-diameter \(\mathrm{diam}(G-F,S,T)\), the maximum distance between vertices in \(S\) and \(T\) in \(G-F\). The oracle has stretch \(\sigma \geq 1\) if \(\mathrm{diam}(G-F,S,T) \leq \widehat{D} \leq \sigma \cdot \mathrm{diam}(G-F,S,T)\). If \(S\) and \(T\) both contain all vertices, the data structure is called an \(f\)-edge fault-tolerant diameter oracle (\(f\)-FDO). An \(f\)-edge fault-tolerant distance sensitivity oracle (\(f\)-DSO) estimates the pairwise graph distances under up to \(f\) failures. We design new \(f\)-FDOs and \(f\)-FDO-\(ST\)s by reducing their construction to that of all-pairs and single-source \(f\)-DSOs. We obtain several new tradeoffs between the size of the data structure, stretch guarantee, query and preprocessing times for diameter oracles by combining our black-box reductions with known results from the literature. We also provide an information-theoretic lower bound on the space requirement of approximate \(f\)-FDOs. We show that there exists a family of graphs for which any \(f\)-FDO with sensitivity \(f \ge 2\) and stretch less than \(5/3\) requires \(\Omega(n^{3/2})\) bits of space, regardless of the query time.
Friedrich, Tobias; Göbel, Andreas; Katzmann, Maximilian; Schiller, Leon Cliques in High-Dimensional Geometric Inhomogeneous Random Graphs. International Colloquium on Automata, Languages and Programming (ICALP) 2023: 62:1–62:13
A recent trend in the context of graph theory is to bring theoretical analyses closer to empirical observations, by focusing the studies on random graph models that are used to represent practical instances. There, it was observed that geometric inhomogeneous random graphs (GIRGs) yield good representations of complex real-world networks, by expressing edge probabilities as a function that depends on (heterogeneous) vertex weights and distances in some underlying geometric space that the vertices are distributed in. While most of the parameters of the model are understood well, it was unclear how the dimensionality of the ground space affects the structure of the graphs. In this paper, we complement existing research into the dimension of geometric random graph models and the ongoing study of determining the dimensionality of real-world networks, by studying how the structure of GIRGs changes as the number of dimensions increases. We prove that, in the limit, GIRGs approach non-geometric inhomogeneous random graphs and present insights on how quickly the decay of the geometry impacts important graph structures. In particular, we study the expected number of cliques of a given size as well as the clique number and characterize phase transitions at which their behavior changes fundamentally. Finally, our insights help in better understanding previous results about the impact of the dimensionality on geometric random graphs.
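As background, in the threshold variant of GIRGs each vertex \(v\) receives a power-law weight \(w_v\) and a position \(x_v\) drawn uniformly from the \(d\)-dimensional torus, and two vertices are adjacent whenever they are geometrically close relative to the product of their weights. One common way to state the connection rule (constants and notation are our own simplification) is

\[
u \sim v \quad\Longleftrightarrow\quad \|x_u - x_v\| \;\le\; c \left(\frac{w_u\, w_v}{W}\right)^{1/d},
\qquad W = \sum_{v} w_v,
\]

so increasing the dimension \(d\) weakens the influence of the geometric distance, consistent with the result above that GIRGs approach non-geometric inhomogeneous random graphs in the limit.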
Bilò, Davide; Cohen, Sarel; Friedrich, Tobias; Gawendowicz, Hans; Klodt, Nicolas; Lenzner, Pascal; Skretas, George Temporal Network Creation Games. International Joint Conference on Artificial Intelligence (IJCAI) 2023: 2511–2519
Most networks are not static objects, but instead they change over time. This observation has sparked rigorous research on temporal graphs in recent years. In temporal graphs, we have a fixed set of nodes and the connections between them are only available at certain time steps. This gives rise to a plethora of algorithmic problems on such graphs, most prominently the problem of finding temporal spanners, i.e., the computation of subgraphs that guarantee all-pairs reachability via temporal paths. To the best of our knowledge, only centralized approaches for the solution of this problem are known. However, many real-world networks are not shaped by a central designer but instead emerge and evolve through the interaction of many strategic agents. This observation is the driving force of the recent intensive research on game-theoretic network formation models. In this work we bring together these two recent research directions: temporal graphs and game-theoretic network formation. As a first step into this new realm, we focus on a simplified setting where a complete temporal host graph is given and the agents, corresponding to its nodes, selfishly create incident edges to ensure that they can reach all other nodes via temporal paths in the created network. This yields temporal spanners as equilibria of our game. We prove results on the convergence to and the existence of equilibrium networks, on the complexity of finding best agent strategies, and on the quality of the equilibria. By taking these first important steps, we uncover challenging open problems that call for an in-depth exploration of the creation of temporal graphs by strategic agents.
Cohen, Sarel; Fischbeck, Philipp; Friedrich, Tobias; Krejca, Martin S. The Common-Neighbors Metric is Noise-Robust and Reveals Substructures of Real-World Networks. Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD) 2023: 67–79
Real-world networks typically display a complex structure that is hard to explain by a single model. A common approach is to partition the edges of the network into disjoint simpler structures. An important property in this context is locality: the two endpoints of an edge usually have many common neighbors. This allows classifying edges into two groups, based on the number of common neighbors of their endpoints. Formally, this is captured by the common-neighbors (CN) metric, which forms the basis of many metrics for detecting outlier edges. Such outliers can be interpreted as noise or as a substructure. We aim to understand how useful the metric is, and empirically analyze several scenarios. We randomly insert outlier edges into real-world and generated graphs with high locality, and measure the metric accuracy for partitioning the combined edges. In addition, we use the metric to decompose real-world networks, and measure properties of the partitions. Our results show that the CN metric is a very good classifier that can reliably detect noise up to extreme levels (\(83\%\) noisy edges). We also provide mathematically rigorous analyses on special random-graph models. Last, we find that the CN metric consistently decomposes real-world networks into two graphs with very different structures.
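A minimal sketch of the CN metric itself; the threshold-based split and all names are illustrative assumptions, not the classification procedure of the paper:

```python
import networkx as nx

def common_neighbors(G: nx.Graph, u, v) -> int:
    """CN metric of an edge: number of common neighbors of its two endpoints."""
    return len(set(G.neighbors(u)) & set(G.neighbors(v)))

def split_by_cn(G: nx.Graph, threshold: int):
    """Partition the edges into a 'local' part (many common neighbors)
    and a candidate 'outlier' part, using a simple threshold."""
    local, outlier = [], []
    for u, v in G.edges:
        (local if common_neighbors(G, u, v) >= threshold else outlier).append((u, v))
    return local, outlier

# toy usage
G = nx.karate_club_graph()
local, outlier = split_by_cn(G, threshold=2)
print(len(local), len(outlier))
```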
Friedrich, Tobias; Gawendowicz, Hans; Lenzner, Pascal; Zahn, Arthur The Impact of Cooperation in Bilateral Network Creation. ACM Symposium on Principles of Distributed Computing (PODC) 2023
Many real-world networks, like the Internet or social networks, are not the result of central design but instead the outcome of the interaction of local agents that selfishly optimize their individual utility. The well-known Network Creation Game by Fabrikant, Luthra, Maneva, Papadimitriou, and Shenker [PODC 2003] models this. There, agents corresponding to network nodes buy incident edges towards other agents for a price of \(\alpha > 0\) and simultaneously try to minimize their buying cost and their total hop distance. Since in many real-world networks, e.g., social networks, consent from both sides is required to establish and maintain a connection, Corbo and Parkes [PODC 2005] proposed a bilateral version of the Network Creation Game, in which mutual consent and payment are required in order to create edges. It is known that this cooperative version has a significantly higher Price of Anarchy compared to the unilateral version. At first glance, this is counter-intuitive, since cooperation should help to avoid socially bad states. However, in the bilateral version only a very restrictive form of cooperation is considered. We investigate this trade-off between the amount of cooperation and the Price of Anarchy by analyzing the bilateral version with respect to various degrees of cooperation among the agents. With this, we provide insights into what kind of cooperation is needed to ensure that socially good networks are created. As a first step in this direction, we focus on tree networks and present a collection of asymptotically tight bounds on the Price of Anarchy that precisely map the impact of cooperation. Most strikingly, we find that weak forms of cooperation already yield a significantly improved Price of Anarchy. In particular, the cooperation of coalitions of size 3 is enough to achieve constant bounds. Moreover, for general networks we show that enhanced cooperation yields close to optimal networks for a wide range of edge prices. Along the way, we disprove an old conjecture by Corbo and Parkes [PODC 2005].
Anand, Konrad; Göbel, Andreas; Pappik, Marcus; Perkins, Will Perfect Sampling for Hard Spheres from Strong Spatial Mixing. International Conference on Randomization and Computation (RANDOM) 2023: 38:1–38:18
We provide a perfect sampling algorithm for the hard-sphere model on subsets of \(\mathbb{R}^d\) with expected running time linear in the volume, under the assumption of strong spatial mixing. A large number of perfect and approximate sampling algorithms have been devised to sample from the hard-sphere model, and our perfect sampling algorithm is efficient for a range of parameters for which only efficient approximate samplers were previously known; moreover, it is faster than these known approximate approaches. Our methods also extend to the more general setting of Gibbs point processes interacting via finite-range, repulsive potentials.
Angrick, Sebastian; Bals, Ben; Casel, Katrin; Cohen, Sarel; Friedrich, Tobias; Hastrich, Niko; Hradilak, Theresa; Issac, Davis; Kißig, Otto; Schmidt, Jonas; Wendt, Leo Solving Directed Feedback Vertex Set by Iterative Reduction to Vertex Cover. Symposium on Experimental Algorithms (SEA) 2023: 10:1–10:14
In the Directed Feedback Vertex Set (DFVS) problem, one is given a directed graph \(G = (V,E)\) and wants to find a minimum cardinality set \(S \subseteq V\) such that \(G-S\) is acyclic. DFVS is a fundamental problem in computer science and finds applications in areas such as deadlock detection. The problem was the subject of the 2022 PACE coding challenge. We develop a novel exact algorithm for the problem that is tailored to perform well on instances that are mostly bi-directed. For such instances, we adapt techniques from the well-researched vertex cover problem. Our core idea is an iterative reduction to vertex cover. To this end, we also develop a new reduction rule that reduces the number of edges that are not bi-directed. With the resulting algorithm, we were able to win third place in the exact track of the PACE challenge. We perform computational experiments and compare the running time to other exact algorithms, in particular to the winning algorithm of the PACE challenge. Our experiments show that we outpace the other algorithms on instances that have a low density of uni-directed edges.
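To convey the core observation on a toy level: every pair of anti-parallel arcs forms a 2-cycle, so any directed feedback vertex set must in particular be a vertex cover of the undirected graph formed by the bi-directed edges. The sketch below extracts that undirected part, computes a (greedy, stand-in) vertex cover, and checks whether deleting it already leaves the remaining digraph acyclic. All names and the greedy cover are illustrative assumptions; this is not the paper's iterative, reduction-based algorithm.

```python
import networkx as nx

def bidirected_part(D: nx.DiGraph) -> nx.Graph:
    """Undirected graph on the vertex pairs that are connected in both directions."""
    G = nx.Graph()
    G.add_nodes_from(D.nodes)
    G.add_edges_from((u, v) for u, v in D.edges if D.has_edge(v, u))
    return G

def greedy_vertex_cover(G: nx.Graph) -> set:
    """Simple 2-approximate vertex cover (stand-in for an exact VC solver)."""
    cover, H = set(), G.copy()
    while H.number_of_edges() > 0:
        u, v = next(iter(H.edges))
        cover.update((u, v))
        H.remove_nodes_from((u, v))
    return cover

def is_feasible_dfvs(D: nx.DiGraph, S: set) -> bool:
    """Check whether removing S leaves the digraph acyclic."""
    return nx.is_directed_acyclic_graph(D.subgraph(set(D.nodes) - S))

# toy usage on a mostly bi-directed instance
D = nx.DiGraph([(0, 1), (1, 0), (1, 2), (2, 1), (2, 0)])
S = greedy_vertex_cover(bidirected_part(D))
print(S, is_feasible_dfvs(D, S))
```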
Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Stephan, Daniel Strongly Hyperbolic Unit Disk Graphs. Symposium on Theoretical Aspects of Computer Science (STACS) 2023: 13:1–13:17
The class of Euclidean unit disk graphs is one of the most fundamental and well-studied graph classes with underlying geometry. In this paper, we identify this class as a special case in the broader class of hyperbolic unit disk graphs and introduce strongly hyperbolic unit disk graphs as a natural counterpart to the Euclidean variant. In contrast to the grid-like structures exhibited by Euclidean unit disk graphs, strongly hyperbolic networks feature hierarchical structures, which are also observed in complex real-world networks. We investigate basic properties of strongly hyperbolic unit disk graphs, including adjacencies and the formation of cliques, and utilize the derived insights to demonstrate that the class is useful for the development and analysis of graph algorithms. Specifically, we develop a simple greedy routing scheme and analyze its performance on strongly hyperbolic unit disk graphs in order to prove that routing can be performed more efficiently on such networks than in general.
Friedrich, Tobias; Issac, Davis; Kumar, Nikhil; Mallek, Nadym; Zeif, Ziena Approximate Max-Flow Min-Multicut Theorem for Graphs of Bounded Treewidth. Symposium on Theory of Computing (STOC) 2023: 1325–1334
We prove an approximate max-multiflow min-multicut theorem for bounded treewidth graphs. In particular, we show the following: Given a treewidth-\(r\) graph, there exists a (fractional) multicommodity flow of value \(f\) and a multicut of capacity \(c\) such that \(f \leq c \leq \mathcal{O}(\ln(r + 1)) \cdot f\). It is well known that the multiflow-multicut gap on an \(r\)-vertex (constant degree) expander graph can be \(\Omega(\ln r)\), and hence our result is tight up to constant factors. Our proof is constructive, and we also obtain a polynomial time \(\mathcal{O}(\ln(r + 1))\)-approximation algorithm for the minimum multicut problem on treewidth-\(r\) graphs. Our algorithm proceeds by rounding the optimal fractional solution to the natural linear programming relaxation of the multicut problem. We introduce novel modifications to the well-known region growing algorithm to facilitate the rounding while guaranteeing at most a logarithmic factor loss in the treewidth.
Bilò, Davide; Chechik, Shiri; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin Approximate Distance Sensitivity Oracles in Subquadratic Space. Symposium on Theory of Computing (STOC) 2023: 1396–1409
An \(f\)-edge fault-tolerant distance sensitivity oracle (\(f\)-DSO) with stretch \(\sigma \ge 1\) is a data structure that preprocesses a given undirected, unweighted graph \(G\) with \(n\) vertices and \(m\) edges, and a positive integer \(f\). When queried with a pair of vertices \(s,t\) and a set \(F\) of at most \(f\) edges, it returns a \(\sigma\)-approximation of the \(s\)-\(t\)-distance in \(G-F\). We study \(f\)-DSOs that take subquadratic space. Thorup and Zwick [JACM 2015] showed that this is only possible for \(\sigma \geq 3\). We present, for any constant \(f \geq 1\) and \(\alpha \in (0,\frac{1}{2})\), and any \(\varepsilon > 0\), an \(f\)-DSO with stretch \(3+\varepsilon\) that takes \(\tilde{O}(n^{2-\frac{\alpha}{f+1}}/\varepsilon)\cdot O(\log n/\varepsilon)^{f+1}\) space and has an \(O(n^\alpha/\varepsilon^2)\) query time. We also give an improved construction for graphs with diameter at most \(D\). For any constant \(k\), we devise an \(f\)-DSO with stretch \(2k-1\) that takes \(O(D^{f+o(1)}n^{1+1/k})\) space and has \(\tilde{O}(D^{o(1)})\) query time, with a preprocessing time of \(O(D^{f+o(1)}mn^{1/k})\). Chechik, Cohen, Fiat, and Kaplan [SODA 2017] presented an \(f\)-DSO with stretch \(1+\varepsilon\) and preprocessing time \(\tilde{O}_\varepsilon(n^5)\), albeit with a super-quadratic space requirement. We show how to reduce their preprocessing time to \(O(mn^{2})\cdot O(\log n/\varepsilon)^{f}\).
Bilò, Davide; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin Compact Distance Oracles with Large Sensitivity and Low Stretch. Algorithms and Data Structures Symposium (WADS) 2023: 149–163
An \(f\)-edge fault-tolerant distance sensitivity oracle (\(f\)-DSO) with stretch \(\sigma \geq 1\) is a data structure that preprocesses an input graph \(G = (V,E)\). When queried with the triple \((s,t,F)\), where \(s, t \in V\) and \(F \subseteq E\) contains at most \(f\) edges of \(G\), the oracle returns an estimate \(\widehat{d}_{G-F}(s,t)\) of the distance \(d_{G-F}(s,t)\) between \(s\) and \(t\) in the graph \(G-F\) such that \(d_{G-F}(s,t) \leq \widehat{d}_{G-F}(s,t) \leq \sigma \cdot d_{G-F}(s,t)\). For any positive integer \(k \ge 2\) and any \(0 < \alpha < 1\), we present an \(f\)-DSO with sensitivity \(f = o(\log n/\log\log n)\), stretch \(2k-1\), space \(O(n^{1+\frac{1}{k}+\alpha+o(1)})\), and an \(\widetilde{O}(n^{1+\frac{1}{k}-\frac{\alpha}{k(f+1)}})\) query time. Prior to our work, there were only three known \(f\)-DSOs with subquadratic space. The first one by Chechik et al. [Algorithmica 2012] has a stretch of \((8k-2)(f+1)\), depending on \(f\). Another approach is storing an \(f\)-edge fault-tolerant \((2k-1)\)-spanner of \(G\). The bottleneck is the large query time due to the size of any such spanner, which is \(\Omega(n^{1+1/k})\) under the Erdős girth conjecture. Bilò et al. [STOC 2023] gave a solution with stretch \(3+\varepsilon\), query time \(O(n^{\alpha})\), but space \(O(n^{2-\frac{\alpha}{f+1}})\), approaching the quadratic barrier for large sensitivity. In the realm of subquadratic space, our \(f\)-DSOs are the first ones that guarantee, at the same time, large sensitivity, low stretch, and non-trivial query time. To obtain our results, we use the approximate distance oracles of Thorup and Zwick [JACM 2005], and the derandomization of the \(f\)-DSO of Weimann and Yuster [TALG 2013] that was recently given by Karthik and Parter [SODA 2021].
Casel, Katrin; Friedrich, Tobias; Issac, Davis; Niklanovits, Aikaterini; Zeif, Ziena Efficient Constructions for the Győri-Lovász Theorem on Almost Chordal Graphs. Workshop on Graph-Theoretic Concepts in Computer Science (WG) 2023: 143–156
In the 1970s, Győri and Lovász showed that for a \(k\)-connected \(n\)-vertex graph, a given set of terminal vertices \(t_1, \dots, t_k\) and natural numbers \(n_1, \dots, n_k\) satisfying \(\sum_{i=1}^{k} n_i = n\), a connected vertex partition \(S_1, \dots, S_k\) satisfying \(t_i \in S_i\) and \(|S_i| = n_i\) exists. However, polynomial algorithms to actually compute such partitions are known so far only for \(k \leq 4\). This motivates us to take a new approach and constrain this problem to particular graph classes instead of restricting the values of \(k\). More precisely, we consider \(k\)-connected chordal graphs and a broader class of graphs related to them. For the first class, we give an algorithm with \(\mathcal{O}(n^2)\) running time that solves the problem exactly, and for the second, an algorithm with \(\mathcal{O}(n^4)\) running time that deviates on at most one vertex from the required vertex partition sizes.