Friedrich, Tobias; Kötzing, Timo; Radhakrishnan, Aishwarya; Schiller, Leon; Schirneck, Martin; Tennigkeit, Georg; Wietheger, Simon. Crossover for Cardinality Constrained Optimization. ACM Transactions on Evolutionary Learning and Optimization 2023: 1–32
To understand better how and why crossover can benefit constrained optimization, we consider pseudo-Boolean functions with an upper bound \(B\) on the number of 1-bits allowed in the length-\(n\) bit string (i.e., a cardinality constraint). We investigate the natural translation of the OneMax test function to this setting, a linear function where \(B\) bits have a weight of \(1+ 1/n\) and the remaining bits have a weight of \(1\). Friedrich et al. [TCS 2020] gave a bound of \(\Theta(n^2)\) for the expected running time of the (1+1) EA on this function. Part of the difficulty when optimizing this problem lies in having to improve individuals meeting the cardinality constraint by flipping a \(1\) and a \(0\) simultaneously. The experimental literature proposes balanced operators, preserving the number of 1-bits, as a remedy. We show that a balanced mutation operator optimizes the problem in \(O(n \log n)\) if \(n-B = O(1)\). However, if \(n-B = \Theta(n)\), we show a bound of \(\Omega(n^2)\), just as for classic bit mutation. Crossover together with a simple island model gives running times of \(O(n^2 / \log n)\) (uniform crossover) and \(O(n\sqrt{n})\) (3-ary majority vote crossover). For balanced uniform crossover with Hamming-distance maximization for diversity we show a bound of \(O(n \log n)\). As an additional contribution, we present an extensive analysis of different balanced crossover operators from the literature.
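The setting described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `fitness` and `balanced_swap`, the choice that the first \(B\) positions carry the heavier weight, and the penalty for infeasible strings are all hypothetical and not taken from the paper.

```python
import random

def fitness(x, B):
    """Weighted OneMax variant: the first B bits weigh 1 + 1/n, the rest weigh 1.

    Assumption (not from the paper): strings with more than B one-bits are
    infeasible and receive the penalty -(number of one-bits).
    """
    n = len(x)
    ones = sum(x)
    if ones > B:
        return -ones
    return sum((1 + 1 / n) * bit for bit in x[:B]) + sum(x[B:])

def balanced_swap(x, rng=random):
    """Flip one uniformly chosen 1-bit and one uniformly chosen 0-bit.

    The number of 1-bits is unchanged. Returns a new list; if x has no
    1-bit or no 0-bit, an unchanged copy is returned.
    """
    ones = [i for i, b in enumerate(x) if b == 1]
    zeros = [i for i, b in enumerate(x) if b == 0]
    y = list(x)
    if ones and zeros:
        y[rng.choice(ones)] = 0
        y[rng.choice(zeros)] = 1
    return y
```

A swap of this kind moves weight from a light position to a heavy one in a single step, which standard bit mutation can only do by flipping a 1 and a 0 simultaneously.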
Casel, Katrin; Friedrich, Tobias; Schirneck, Martin; Wietheger, Simon. Fair Correlation Clustering in Forests. Foundations of Responsible Computing (FORC) 2023: 9:1–9:12
The study of algorithmic fairness has received growing attention recently. This stems from the awareness that bias in the input data for machine learning systems may result in discriminatory outputs. For clustering tasks, one of the most central notions of fairness is the formalization by Chierichetti, Kumar, Lattanzi, and Vassilvitskii [NeurIPS 2017]. A clustering is said to be fair if each cluster has the same distribution of manifestations of a sensitive attribute as the whole input set. This is motivated by various applications where the objects to be clustered have sensitive attributes that should not be over- or underrepresented. Most research on this version of fair clustering has focused on centroid-based objectives. In contrast, we discuss the applicability of this fairness notion to Correlation Clustering. The existing literature on the resulting Fair Correlation Clustering problem either presents approximation algorithms with poor approximation guarantees or severely limits the possible distributions of the sensitive attribute (often only two manifestations with a 1:1 ratio are considered). Our goal is to understand if there is hope for better results in between these two extremes. To this end, we consider restricted graph classes which allow us to characterize the distributions of sensitive attributes for which this form of fairness is tractable from a complexity point of view. While existing work on Fair Correlation Clustering gives approximation algorithms, we focus on exact solutions and investigate whether there are efficiently solvable instances. The unfair version of Correlation Clustering is trivial on forests, but adding fairness creates a surprisingly rich picture of complexities. We give an overview of the distributions and types of forests where Fair Correlation Clustering turns from tractable to intractable.
As the most surprising insight, we consider the fact that the cause of the hardness of Fair Correlation Clustering is not the strictness of the fairness condition. We lift most of our results to also hold for the relaxed version of the fairness condition. Instead, the source of hardness seems to be the distribution of the sensitive attribute. On the positive side, we identify some reasonable distributions that are indeed tractable. While this tractability is only shown for forests, it may open an avenue to design reasonable approximations for larger graph classes.
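The strict fairness condition used in this abstract (each cluster has the same distribution of the sensitive attribute as the whole input) can be checked directly. The following Python sketch is an illustration only; the function name `is_fair` and the input representation (a list of clusters plus a dictionary of colors) are assumptions, not the paper's notation.

```python
from collections import Counter
from fractions import Fraction

def is_fair(clusters, color):
    """Return True if every cluster mirrors the global color proportions.

    clusters: iterable of non-empty collections of vertices (a partition).
    color:    dict mapping each vertex to its sensitive-attribute value.
    """
    total = Counter(color.values())
    n = sum(total.values())
    for cluster in clusters:
        counts = Counter(color[v] for v in cluster)
        size = len(cluster)
        for c, k in total.items():
            # Exact rational comparison avoids floating-point rounding.
            if Fraction(counts[c], size) != Fraction(k, n):
                return False
    return True
```

For example, with two colors in a 1:1 ratio, a cluster is fair exactly when it contains equally many vertices of each color.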
Bilò, Davide; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin. Fault-Tolerant ST-Diameter Oracles. International Colloquium on Automata, Languages and Programming (ICALP) 2023: 24:1–24:20
We study the problem of estimating the \(ST\)-diameter of a graph that is subject to a bounded number of edge failures. An \(f\)-edge fault-tolerant \(ST\)-diameter oracle (\(f\)-FDO-\(ST\)) is a data structure that preprocesses a given graph \(G\), two sets of vertices \(S,T\), and a positive integer \(f\). When queried with a set \(F\) of at most \(f\) edges, the oracle returns an estimate \(\widehat{D}\) of the \(ST\)-diameter \(\mathrm{diam}(G-F,S,T)\), the maximum distance between vertices in \(S\) and \(T\) in \(G-F\). The oracle has stretch \(\sigma \geq 1\) if \(\mathrm{diam}(G-F,S,T) \leq \widehat{D} \leq \sigma \mathrm{diam}(G-F,S,T)\). If \(S\) and \(T\) both contain all vertices, the data structure is called an \(f\)-edge fault-tolerant diameter oracle (\(f\)-FDO). An \(f\)-edge fault-tolerant distance sensitivity oracle (\(f\)-DSO) estimates the pairwise graph distances under up to \(f\) failures. We design new \(f\)-FDOs and \(f\)-FDO-\(ST\)s by reducing their construction to that of all-pairs and single-source \(f\)-DSOs. We obtain several new tradeoffs between the size of the data structure, stretch guarantee, query and preprocessing times for diameter oracles by combining our black-box reductions with known results from the literature. We also provide an information-theoretic lower bound on the space requirement of approximate \(f\)-FDOs. We show that there exists a family of graphs for which any \(f\)-FDO with sensitivity \(f \ge 2\) and stretch less than \(5/3\) requires \(\Omega(n^{3/2})\) bits of space, regardless of the query time.
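To make the queried quantity concrete, the following Python sketch computes the exact \(ST\)-diameter of \(G-F\) by brute force, running one breadth-first search per vertex of \(S\). It is a reference baseline with stretch 1 and no preprocessing, not one of the oracles from the paper. The function names and the adjacency-list input are assumptions, and the graph is assumed to be undirected and unweighted.

```python
from collections import deque

def bfs_distances(adj, source, failed):
    """Hop distances from source in the graph with the failed edges removed."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if frozenset((u, v)) in failed or v in dist:
                continue
            dist[v] = dist[u] + 1
            queue.append(v)
    return dist

def st_diameter(adj, S, T, F=()):
    """Exact diam(G-F, S, T): the largest s-t distance over s in S, t in T.

    adj: dict mapping each vertex to a list of neighbours (undirected).
    F:   iterable of failed edges, each given as a pair (u, v).
    Returns float('inf') if some s-t pair is disconnected in G-F.
    """
    failed = {frozenset(e) for e in F}
    best = 0
    for s in S:
        dist = bfs_distances(adj, s, failed)
        for t in T:
            best = max(best, dist.get(t, float("inf")))
    return best
```

Each query costs \(O(|S| \cdot (n + m))\) time in this baseline, which is the kind of per-query cost that a preprocessed oracle is meant to avoid.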
Bilò, Davide; Chechik, Shiri; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin. Approximate Distance Sensitivity Oracles in Subquadratic Space. Symposium on Theory of Computing (STOC) 2023: 1396–1409
An \(f\)-edge fault-tolerant distance sensitive oracle (\(f\)-DSO) with stretch \(\sigma\ge1\) is a data structure that preprocesses a given undirected, unweighted graph \(G\) with \(n\) vertices and \(m\) edges, and a positive integer \(f\). When queried with a pair of vertices \(s,t\) and a set \(F\) of at most \(f\) edges, it returns a \(\sigma\)-approximation of the \(s\)-\(t\)-distance in \(G-F\). We study \(f\)-DSOs that take subquadratic space. Thorup and Zwick [JACM 2005] showed that this is only possible for \(\sigma\geq3\). We present, for any constant \(f\geq1\) and \(\alpha\in(0,\frac{1}{2})\), and any \(\varepsilon>0\), an \(f\)-DSO with stretch \(3+\varepsilon\) that takes \(\tilde{O}(n^{2-\frac{\alpha}{f+1}}/\varepsilon)\cdot{}O(\log n/\varepsilon)^{f+1}\) space and has an \(O(n^\alpha/\varepsilon^2)\) query time. We also give an improved construction for graphs with diameter at most \(D\). For any constant \(k\), we devise an \(f\)-DSO with stretch \(2k-1\) that takes \(O(D^{f+o(1)}n^{1+1/k})\) space and has \(\tilde{O}(D^{o(1)})\) query time, with a preprocessing time of \(O(D^{f+o(1)}mn^{1/k})\). Chechik, Cohen, Fiat, and Kaplan [SODA 2017] presented an \(f\)-DSO with stretch \(1{+}\varepsilon\) and preprocessing time \(\tilde{O}_\varepsilon(n^5)\), albeit with a super-quadratic space requirement. We show how to reduce their preprocessing time to \(O(mn^{2})\cdot{}O(\log n/\varepsilon)^{f}\).
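As a small worked example of the space bound stated above, the following Python sketch evaluates the exponent \(2-\frac{\alpha}{f+1}\) of \(n\) for a few sensitivities. It ignores the polylogarithmic and \(\varepsilon\)-dependent factors, and the helper name `space_exponent` is hypothetical.

```python
from fractions import Fraction

def space_exponent(alpha, f):
    """Exponent of n in the space bound n^(2 - alpha/(f+1)).

    Polylogarithmic and epsilon-dependent factors are ignored.
    Requires 0 < alpha < 1/2 and an integer f >= 1, as in the abstract.
    """
    alpha = Fraction(alpha)
    if not (0 < alpha < Fraction(1, 2)) or f < 1:
        raise ValueError("need 0 < alpha < 1/2 and f >= 1")
    return 2 - alpha / (f + 1)

# Print the exponent for a fixed alpha and growing sensitivity f.
for f in (1, 3, 7):
    print(f, space_exponent(Fraction(2, 5), f))
```

With \(\alpha = 2/5\), the exponent is \(9/5\) for \(f=1\), \(19/10\) for \(f=3\), and \(39/20\) for \(f=7\): the space stays subquadratic for every constant \(f\), but the gap to \(n^2\) shrinks as the sensitivity grows.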
Bilò, Davide; Choudhary, Keerti; Cohen, Sarel; Friedrich, Tobias; Krogmann, Simon; Schirneck, Martin. Compact Distance Oracles with Large Sensitivity and Low Stretch. Algorithms and Data Structures Symposium (WADS) 2023: 149–163
An \(f\)-edge fault-tolerant distance sensitive oracle (\(f\)-DSO) with stretch \(\sigma \geq 1\) is a data structure that preprocesses an input graph \(G = (V,E)\). When queried with the triple \((s,t,F)\), where \(s, t \in V\) and \(F \subseteq E\) contains at most \(f\) edges of \(G\), the oracle returns an estimate \(\widehat{d}_{G-F}(s,t)\) of the distance \(d_{G-F}(s,t)\) between \(s\) and \(t\) in the graph \(G-F\) such that \(d_{G-F}(s,t) \leq \widehat{d}_{G-F}(s,t) \leq \sigma \cdot d_{G-F}(s,t)\). For any positive integer \(k \ge 2\) and any \(0 < \alpha < 1\), we present an \(f\)-DSO with sensitivity \(f = o(\log n/\log\log n)\), stretch \(2k-1\), space \(O(n^{1+\frac{1}{k}+\alpha+o(1)})\), and an \(\widetilde{O}(n^{1+\frac{1}{k} - \frac{\alpha}{k(f+1)}})\) query time. Prior to our work, there were only three known \(f\)-DSOs with subquadratic space. The first one by Chechik et al. [Algorithmica 2012] has a stretch of \((8k-2)(f+1)\), depending on \(f\). Another approach is storing an \(f\)-edge fault-tolerant \((2k-1)\)-spanner of \(G\). The bottleneck is the large query time due to the size of any such spanner, which is \(\Omega(n^{1+1/k})\) under the Erdős girth conjecture. Bilò et al. [STOC 2023] gave a solution with stretch \(3+\varepsilon\), query time \(O(n^{\alpha})\) but space \(O(n^{2-\frac{\alpha}{f+1}})\), approaching the quadratic barrier for large sensitivity. In the realm of subquadratic space, our \(f\)-DSOs are the first ones that guarantee, at the same time, large sensitivity, low stretch, and non-trivial query time. To obtain our results, we use the approximate distance oracles of Thorup and Zwick [JACM 2005], and the derandomization of the \(f\)-DSO of Weimann and Yuster [TALG 2013] that was recently given by Karthik and Parter [SODA 2021].