
Friedrich, Tobias; Krejca, Martin S.; Rothenberger, Ralf; Arndt, Tobias; Hafner, Danijar; Kellermeier, Thomas; Krogmann, Simon; Razmjou, Armin Routing for OnStreet Parking Search using Probabilistic Data. AI Communications 2019: 113124
A significant percentage of urban traffic is caused by the search for parking spots. One possible approach to improve this situation is to guide drivers along routes which are likely to have free parking spots. The task of finding such a route can be modeled as a probabilistic graph problem which is NPcomplete. Thus, we propose heuristic approaches for solving this problem and evaluate them experimentally. For this, we use probabilities of finding a parking spot, which are based on publicly available empirical data from TomTom International B.V. Additionally, we propose a heuristic that relies exclusively on conventional road attributes. Our experiments show that this algorithm comes close to the baseline by a factor of 1.3 in our cost measure. Last, we complement our experiments with results from a field study, comparing the success rates of our algorithms against real human drivers.

Shi, Feng; Schirneck, Martin; Friedrich, Tobias; Kötzing, Timo; Neumann, Frank Erratum to: Reoptimization Time Analysis of Evolutionary Algorithms on Linear Functions Under Dynamic Uniform Constraints. Algorithmica 2019
In the article "Reoptimization Time Analysis of Evolutionary Algorithms on Linear Functions Under Dy namic Uniform Constraints", we claimed a worstcase runtime of \(O(nD \log D)\) and \(O(nD)\) for the MultiObjective Evolutionary Algorithm and the MultiObjective \((\mu+(\lambda, \lambda))\) Genetic Algorithm, respectively, on linear profit functions under dynamic uniform constraint, where \(D = B − B^*\) denotes the difference between the original constraint bound \(B\) and the new one \(B^*\) . The technique used to prove these results contained an error. We correct this mistake and show a weaker bound of \(O(nD^2)\) for both algorithms instead.

Doerr, Benjamin; Doerr, Carola; Kötzing, Timo Solving Problems with Unknown Solution Length at Almost No Extra Cost. Algorithmica 2019: 703748

Shi, Feng; Schirneck, Martin; Friedrich, Tobias; Kötzing, Timo; Neumann, Frank Reoptimization Time Analysis of Evolutionary Algorithms on Linear Functions Under Dynamic Uniform Constraints. Algorithmica 2019: 828857
Rigorous runtime analysis is a major approach towards understanding evolutionary computing techniques, and in this area linear pseudoBoolean objective functions play a central role. Having an additional linear constraint is then equivalent to the NPhard Knapsack problem, certain classes thereof have been studied in recent works. In this article, we present a dynamic model of optimizing linear functions under uniform constraints. Starting from an optimal solution with respect to a given constraint bound, we investigate the runtimes that different evolutionary algorithms need to recompute an optimal solution when the constraint bound changes by a certain amount. The classical \((1+1)\) EA and several populationbased algorithms are designed for that purpose, and are shown to recompute efficiently. Furthermore, a variant of the \((1+(\lambda,\lambda))\) GA for the dynamic optimization problem is studied, whose performance is better when the change of the constraint bound is small.

Doerr, Benjamin; Fischbeck, Philipp; Frahnow, Clemens; Friedrich, Tobias; Kötzing, Timo; Schirneck, Martin Island Models Meet Rumor Spreading. Algorithmica 2019: 886915
Island models in evolutionary computation solve problems by a careful interplay of independently running evolutionary algorithms on the island and an exchange of good solutions between the islands. In this work, we conduct rigorous run time analyses for such island models trying to simultaneously obtain good run times and low communication effort. We improve the existing upper bounds for both measures (i) by improving the run time bounds via a careful analysis, (ii) by balancing individual computation and communication in a more appropriate manner, and (iii) by replacing the usual communicatewithall approach with randomized rumor spreading. In the latter, each island contacts a randomly chosen neighbor. This epidemic communication paradigm is known to lead to very fast and robust information dissemination in many applications. Our results concern island models running simple (1+1) evolutionary algorithms to optimize the classic test functions OneMax and LeadingOnes. We investigate binary trees, ddimensional tori, and complete graphs as communication topologies.

Gao, Ziyuan; Jain, Sanjay; Khoussainov, Bakhadyr; Li, Wei; Melnikov, Alexander; Seidel, Karen; Stephan, Frank Random Subgroups of Rationals. arXiv preprint arXiv:1901.04743 2019
This paper introduces and studies a notion of algorithmic randomness for subgroups of rationals. Given a randomly generated additive subgroup \((G,+)\) of rationals, two main questions are addressed: first, what are the modeltheoretic and recursiontheoretic properties of \((G,+)\); second, what learnability properties can one extract from \(G\) and its subclass of finitely generated subgroups? For the first question, it is shown that the theory of \((G,+)\) coincides with that of the additive group of integers and is therefore decidable; furthermore, while the word problem for \(G\) with respect to any generating sequence for \(G\) is not even semidecidable, one can build a generating sequence \(\beta\) such that the word problem for \(G\) with respect to \(\beta\) is corecursively enumerable (assuming that the set of generators of \(G\) is limitrecursive). In regard to the second question, it is proven that there is a generating sequence \(\beta\) for \(G\) such that every nontrivial finitely generated subgroup of \(G\) is recursively enumerable and the class of all such subgroups of \(G\) is behaviourally correctly learnable, that is, every nontrivial finitely generated subgroup can be semantically identified in the limit (again assuming that the set of generators of \(G\) is limitrecursive). On the other hand, the class of nontrivial finitely generated subgroups of \(G\) cannot be syntactically identified in the limit with respect to any generating sequence for \(G\). The present work thus contributes to a recent line of research studying algorithmically random infinite structures and uncovers an interesting connection between the arithmetical complexity of the set of generators of a randomly generated subgroup of rationals and the learnability of its finitely generated subgroups.

Trubenova, Barbora; Kötzing, Timo; Krejca, Martin S.; Lehre, Per Kristian Surfing on the seascape: Adaptation in a changing environment. Evolution: International Journal of Organic Evolution 2019: 13561374
The environment changes constantly at various time scales and, in order to survive, species need to keep adapting. Whether these species succeed in avoiding extinction is a major evolutionary question. Using a multilocus evolutionary model of a mutation‐limited population adapting under strong selection, we investigate the effects of the frequency of environmental fluctuations on adaptation. Our results rely on an “adaptive‐walk” approximation and use mathematical methods from evolutionary computation theory to investigate the interplay between fluctuation frequency, the similarity of environments, and the number of loci contributing to adaptation. First, we assume a linear additive fitness function, but later generalize our results to include several types of epistasis. We show that frequent environmental changes prevent populations from reaching a fitness peak, but they may also prevent the large fitness loss that occurs after a single environmental change. Thus, the population can survive, although not thrive, in a wide range of conditions. Furthermore, we show that in a frequently changing environment, the similarity of threats that a population faces affects the level of adaptation that it is able to achieve. We check and supplement our analytical results with simulations.

Battaglia, Francesco; Cucina, Domenico; Rizzo, Manuel Parsimonious periodic autoregressive models for time series with evolving trend and seasonality. Statistics and Computing 2019
This paper proposes an extension of Periodic AutoRegressive (PAR) modelling for time series with evolving features. The large scale of modern datasets, in fact, implies that the time span may subtend several evolving patterns of the underlying series, affecting also seasonality. The proposed model allows several regimes in time and a possibly different PAR process with a trend term in each regime. The means, autocorrelations and residual variances may change both with the regime and the season, resulting in a very large number of parameters. Therefore as a second step we propose a grouping procedure on the PAR parameters, in order to obtain a more parsimonious and concise model. The model selection procedure is a complex combinatorial problem, and it is solved basing on Genetic Algorithms that optimize an information criterion. The model is tested in both simulation studies and real data analysis from different fields, proving to be effective for a wide range of series with evolving features, and competitive with respect to more specific models.

Cucina, Domenico; Rizzo, Manuel; Ursu, Eugen Multiple changepoint detection for periodic autoregressive models with an application to river flow analysis. Stochastic Environmental Research and Risk Assessment 2019: 11371157

Kötzing, Timo; Krejca, Martin S. FirstHitting Times Under Drift. Theoretical Computer Science 2019
For the last ten years, almost every theoretical result concerning the expected run time of a randomized search heuristic used drift theory, making it the arguably most important tool in this domain. Its success is due to its ease of use and its powerful result: drift theory allows the user to derive bounds on the expected firsthitting time of a random process by bounding expected local changes of the process – the drift. This is usually far easier than bounding the expected firsthitting time directly. Due to the widespread use of drift theory, it is of utmost importance to have the best drift theorems possible. We improve the fundamental additive, multiplicative, and variable drift theorems by stating them in a form as general as possible and providing examples of why the restrictions we keep are still necessary. Our additive drift theorem for upper bounds only requires the process to be lowerbounded, that is, we remove unnecessary restrictions like a finite, discrete, or bounded state space. As corollaries, the same is true for our upper bounds in the case of variable and multiplicative drift. By bounding the step size of the process, we derive new lowerbounding multiplicative and variable drift theorems. Last, we also state theorems that are applicable when the process has a drift of 0, by using a drift on the variance of the process.

Friedrich, Tobias; Kötzing, Timo; Lagodzinski, J. A. Gregor; Neumann, Frank; Schirneck, Martin Analysis of the (1+1) EA on Subclasses of Linear Functions under Uniform and Linear Constraints. Theoretical Computer Science 2019
Linear functions have gained great attention in the run time analysis of evolutionary computation methods. The corresponding investigations have provided many effective tools for analyzing more complex problems. So far, the runtime analysis of evolutionary algorithms has mainly focused on unconstrained problems, but problems occurring in applications frequently involve constraints. Therefore, there is a strong need to extend the methods for analyzing unconstrained problems to a setting involving constraints. In this paper, we consider the behavior of the classical (1+1) evolutionary algorithm on linear functions under linear constraint. We show tight bounds in the case where the constraint is given by the OneMax function and the objective function is given by either the OneMax or the BinVal function. For the general case we present upper and lower bounds.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S. Unbiasedness of EstimationofDistribution Algorithms. Theoretical Computer Science 2019: 4659
In the context of blackbox optimization, blackbox complexity is used for understanding the inherent difficulty of a given optimization problem. Central to our understanding of natureinspired search heuristics in this context is the notion of unbiasedness. Specialized blackbox complexities have been developed in order to better understand the limitations of these heuristics – especially of (populationbased) evolutionary algorithms (EAs). In contrast to this, we focus on a model for algorithms explicitly maintaining a probability distribution over the search space: socalled estimationofdistribution algorithms (EDAs). We consider the recently introduced \(n\)Bernoulli\(\lambda\)EDA framework, which subsumes, for example, the commonly known EDAs PBIL, UMDA, \(\lambda\)MMAS\(_\textrm{IB}\), and cGA. We show that an \(n\)Bernoulli\(\lambda\)EDA is unbiased if and only if its probability distribution satisfies a certain invariance property under isometric automorphisms of \([0, 1]^n\). By restricting how an \(n\)Bernoulli\(\lambda\)EDA can perform an update, in a way common to many examples, we derive conciser characterizations, which are easy to verify. We demonstrate this by showing that our examples above are all unbiased.

Bilò, Davide; Lenzner, Pascal On the Tree Conjecture for the Network Creation Game. Theory of Computing Systems 2019
Selfish Network Creation focuses on modeling real world networks from a gametheoretic point of view. One of the classic models by Fabrikant et al. (2003) is the network creation game, where agents correspond to nodes in a network which buy incident edges for the price of \($\alpha$\) per edge to minimize their total distance to all other nodes. The model is wellstudied but still has intriguing open problems. The most famous conjectures state that the price of anarchy is constant for all \($\alpha$\) and that for \($\alpha \geq n$\) all equilibrium networks are trees. We introduce a novel technique for analyzing stable networks for high edgeprice \($\alpha$\) and employ it to improve on the best known bound for the latter conjecture. In particular we show that for \($\alpha>4n −13$\) all equilibrium networks must be trees, which implies a constant price of anarchy for this range of \($\alpha$\). Moreover, we also improve the constant upper bound on the price of anarchy for equilibrium trees.

Krejca, Martin S.; Witt, Carsten Theory of EstimationofDistribution Algorithms. Theory of Evolutionary Computation: Recent Developments in Discrete Optimization 2019: 406442
Estimationofdistribution algorithms (EDAs) are general metaheuristics used in optimization that represent a more recent alternative to classical approaches like evolutionary algorithms. In a nutshell, EDAs typically do not directly evolve populations of search points but build probabilistic models of promising solutions by repeatedly sampling and selecting points from the underlying search space. Recently, there has been made significant progress in the theoretical understanding of EDAs. This article provides an uptodate overview of the most commonly analyzed EDAs and the most recent theoretical results in this area. In particular, emphasis is put on the runtime analysis of simple univariate EDAs, including a description of typical benchmark functions and tools for the analysis. Along the way, open problems and directions for future research are described.

Friedrich, Tobias; Göbel, Andreas; Neumann, Frank; Quinzan, Francesco; Rothenberger, Ralf Greedy Maximization of Functions with Bounded Curvature Under Partition Matroid Constraints. Conference on Artificial Intelligence (AAAI) 2019: 22722279
We investigate the performance of a deterministic GREEDY algorithm for the problem of maximizing functions under a partition matroid constraint. We consider nonmonotone submodular functions and monotone subadditive functions. Even though constrained maximization problems of monotone submodular functions have been extensively studied, little is known about greedy maximization of nonmonotone submodular functions or monotone subadditive functions. We give approximation guarantees for GREEDY on these problems, in terms of the curvature. We find that this simple heuristic yields a strong approximation guarantee on a broad class of functions. We discuss the applicability of our results to three realworld problems: Maximizing the determinant function of a positive semidefinite matrix, and related problems such as the maximum entropy sampling problem, the constrained maximum cut problem on directed graphs, and combinatorial auction games. We conclude that GREEDY is wellsuited to approach these problems. Overall, we present evidence to support the idea that, when dealing with constrained maximization problems with bounded curvature, one needs not search for (approximate) monotonicity to get good approximate solutions.

Roostapour, Vahid; Neumann, Aneta; Neumann, Frank; Friedrich, Tobias Pareto Optimization for Subset Selection with Dynamic Cost Constraints. Conference on Artificial Intelligence (AAAI) 2019: 23542361
In this paper, we consider subset selection problems for functions \(f\) with constraints where the constraint bound \(B\) changes over time. We point out that adaptive variants of greedy approaches commonly used in the area of submodular optimization are not able to maintain their approximation quality. Investigating the recently introduced POMC Pareto optimization approach, we show that this algorithm efficiently computes a \( phi= (\alpha_f/2)(1\frac{1}{e^{\alpha_f}})\)approximation, where \(\alpha_f\) is the submodularity ratio, for each possible constraint bound \(b \leq B\). Furthermore, we show that POMC is able to adapt its set of solutions quickly in the case that \(B\) increases. Our experimental investigations for the influence maximization in social networks show the advantage of POMC over generalized greedy algorithms.

Feldotto, Matthias; Lenzner, Pascal; Molitor, Louise; Skopalik, Alexander From Hotelling to Load Balancing: Approximation and the Principle of Minimum Differentiation. Autonomous Agents and Multiagent Systems (AAMAS) 2019: 19491951
Competing firms tend to select similar locations for their stores. This phenomenon, called the principle of minimum differentiation, was captured by Hotelling with a landmark model of spatial competition but is still the object of an ongoing scientific debate. Although consistently observed in practice, many more realistic variants of Hotelling's model fail to support minimum differentiation or do not have pure equilibria at all. In particular, it was recently proven for a generalized model which incorporates negative network externalities and which contains Hotelling's model and classical selfish load balancing as special cases, that the unique equilibria do not adhere to minimum differentiation. Furthermore, it was shown that for a significant parameter range pure equilibria do not exist. We derive a sharp contrast to these previous results by investigating Hotelling's model with negative network externalities from an entirely new angle: approximate pure subgame perfect equilibria. This approach allows us to prove analytically and via agentbased simulations that approximate equilibria having good approximation guarantees and that adhere to minimum differentiation exist for the full parameter range of the model. Moreover, we show that the obtained approximate equilibria have high social welfare.

Bläsius, Thomas; Friedrich, Tobias; Lischeid, Julius; Meeks, Kitty; Schirneck, Martin Efficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling. Algorithm Engineering and Experiments (ALENEX) 2019: 130143
We devise an enumeration method for inclusionwise minimal hitting sets in hypergraphs. It has delay \(O(m^{k^\ast+1} \cdot n^2)\) and uses linear space. Hereby, \(n\) is the number of vertices, \(m\) the number of hyperedges, and \(k^\ast\) the rank of the transversal hypergraph. In particular, on classes of hypergraphs for which the cardinality \(k^\ast\) of the largest minimal hitting set is bounded, the delay is polynomial. The algorithm solves the extension problem for minimal hitting sets as a subroutine. We show that the extension problem is W[3]complete when parameterised by the cardinality of the set which is to be extended. For the subroutine, we give an algorithm that is optimal under the exponential time hypothesis. Despite these lower bounds, we provide empirical evidence showing that the enumeration outperforms the theoretical worstcase guarantee on hypergraphs arising in the profiling of relational databases, namely, in the detection of unique column combinations.

Fokina, Ekaterina; Kötzing, Timo; San Mauro, Luca Limit Learning Equivalence Structures. Algorithmic Learning Theory (ALT) 2019: 120
While most research in Goldstyle learning focuses on learning formal languages, we consider the identification of computable structures, specifically equivalence structures. In our core model the learner gets more and more information about which pairs of elements of a structure are related and which are not. The aim of the learner is to find (an effective description of) the isomorphism type of the structure presented in the limit. In accordance with language learning we call this learning criterion InfExlearning (explanatory learning from informant). We start with a discussion and separations of different variants of this learning criterion, including learning from text (where the only information provided is which elements are related, and not which elements are not related) and finite learning (where the first actual conjecture of the learner has to be correct). This gives first intuitions and examples for what (classes of) structures are learnable and which are not. Our main contribution is a complete characterization of the learnable classes of structures in terms of a combinatorial property of the classes. This property allows us to derive a bound of \(\mathbf{0''}\) on the computational complexity required to learn uniformly enumerable classes of structures. Finally, we show how learning classes of structures relates to learning classes of languages by mapping learning tasks for structures to equivalent learning tasks for languages.

Casel, Katrin; Fernau, Henning; Khosravian Ghadikolaei, Mehdi; Monnot, Jerome; Sikora, Florian Extension of vertex cover and independent set in some classes of graphs and generalizations. International Conference on Algorithms and Complexity (CIAC) 2019
We study extension variants of the classical problems Vertex Cover and Independent Set. Given a graph \(G = (V, E)\) and a vertex set \(U \subseteq V\), it is asked if there exists a minimal vertex cover (resp. maximal independent set) \(S\) with \(U \subseteq S\) (resp. \(U \supseteq S\)). Possibly contradicting intuition, these problems tend to be NPcomplete, even in graph classes where the classical problem can be solved efficiently. Yet, we exhibit some graph classes where the extension variant remains polynomialtime solvable. We also study the parameterized complexity of theses problems, with parameter \(U\), as well as the optimality of simple exact algorithms under ETH. All these complexity considerations are also carried out in very restricted scenarios, be it degree or topological restrictions (bipartite, planar or chordal graphs). This also motivates presenting some explicit branching algorithms for degreebounded instances. We further discuss the price of extension, measuring the distance of \(U\) to the closest set that can be extended, which results in natural optimization problems related to extension problems for which we discuss polynomialtime approximability.

Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Meyer, Ulrich; Penschuck, Manuel; Weyand, Christopher Efficiently Generating Geometric Inhomogeneous and Hyperbolic Random Graphs. European Symposium on Algorithms (ESA) 2019: 21:221:14
Hyperbolic random graphs (HRG) and geometric inhomogeneous random graphs (GIRG) are two similar generative network models that were designed to resemble complex real world networks. In particular, they have a powerlaw degree distribution with controllable exponent \(\beta\), and high clustering that can be controlled via the temperature \(T\). We present the first implementation of an efficient GIRG generator running in expected linear time. Besides varying temperatures, it also supports underlying geometries of higher dimensions. It is capable of generating graphs with ten million edges in under a second on commodity hardware. The algorithm can be adapted to HRGs. Our resulting implementation is the fastest sequential HRG generator, despite the fact that we support nonzero temperatures. Though nonzero temperatures are crucial for many applications, most existing generators are restricted to \(T = 0\). We also support parallelization, although this is not the focus of this paper. Moreover, we note that our generators draw from the correct probability distribution, i.e., they involve no approximation. Besides the generators themselves, we also provide an efficient algorithm to determine the nontrivial dependency between the average degree of the resulting graph and the input parameters of the GIRG model. This makes it possible to specify \(\bar{d}\) as input and obtain a graph with expected average degree \(\bar{d}\). Moreover, we investigate the differences between HRGs and GIRGs, shedding new light on the nature of the relation between the two models. Although HRGs represent, in a certain sense, a special case of the GIRG model, we find that a straightforward inclusion does not hold in practice. However, the difference is negligible for most use cases.

Casel, Katrin; Fernau, Henning; Khosravian Ghadikolaei, Mehdi; Monnot, Jerome; Sikora, Florian Extension of some edge graph problems: standard and parameterized complexity. Fundamentals of Computation Theory (FCT) 2019
We consider extension variants of some edge optimization problems in graphs containing the classical Edge Cover, Matching, and Edge Dominating Set problems. Given a graph \(G=(V,E)\) and an edge set \(U \subseteq E\), it is asked whether there exists an inclusionwise minimal (resp., maximal) feasible solution \(E'\) which satisfies a given property, for instance, being an edge dominating set (resp., a matching) and containing the forced edge set \(U\) (resp., avoiding any edges from the forbidden edge set \(E \setminus U\)). We present hardness results for these problems, for restricted instances such as bipartite or planar graphs. We counterbalance these negative results with parameterized complexity results. We also consider the price of extension, a natural optimization problem variant of extension problems, leading to some approximation results.

Doerr, Benjamin; Kötzing, Timo Multiplicative UpDrift. Genetic and Evolutionary Computation Conference (GECCO) 2019
Drift analysis aims at translating the expected progress of an evo lutionary algorithm (or more generally, a random process) into a probabilistic guarantee on its run time (hitting time). So far, drift arguments have been successfully employed in the rigorous analy sis of evolutionary algorithms, however, only for the situation that the progress is constant or becomes weaker when approaching the target. Motivated by questions like how fast fit individuals take over a population, we analyze random processes exhibiting a multiplica tive growth in expectation. We prove a drift theorem translating this expected progress into a hitting time. This drift theorem gives a sim ple and insightful proof of the levelbased theorem first proposed by Lehre (2011). Our version of this theorem has, for the first time, the bestpossible linear dependence on the growth parameter \(\delta\) (the previousbest was quadratic). This gives immediately stronger run time guarantees for a number of applications.

Casel, Katrin; Day, Joel D.; Fleischmann, Pamela; Kociumaka, Tomasz; Manea, Florin; Schmid, Markus L. Graph and String Parameters: Connections Between Pathwidth, Cutwidth and the Locality Number. International Colloquium on Automata, Languages and Programming (ICALP) 2019
We investigate the locality number, a recently introduced structural parameter for strings (with applications in pattern matching with variables), and its connection to two important graphparameters, cutwidth and pathwidth. These connections allow us to show that computing the locality number is NPhard but fixed parameter tractable (when the locality number or the alphabet size is treated as a parameter), and can be approximated with ratio \(O(\sqrt{\log opt \log n)\). As a byproduct, we also relate cutwidth via the locality number to pathwidth, which is of independent interest, since it improves the currently best known approximation algorithm for cutwidth. In addition to these main results, we also consider the possibility of greedybased approximation algorithms for the locality number.

Friedrich, Tobias; Rothenberger, Ralf The Satisfiability Threshold for NonUniform Random 2SAT. International Colloquium on Automata, Languages and Programming (ICALP) 2019: 61:161:14
Propositional satisfiability (SAT) is one of the most fundamental problems in computer science. Its worstcase hardness lies at the core of computational complexity theory, for example in the form of NPhardness and the (Strong) Exponential Time Hypothesis. In practice however, SAT instances can often be solved efficiently. This contradicting behavior has spawned interest in the averagecase analysis of SAT and has triggered the development of sophisticated rigorous and nonrigorous techniques for analyzing random structures. Despite a long line of research and substantial progress, most theoretical work on random SAT assumes a uniform distribution on the variables. In contrast, realworld instances often exhibit large fluctuations in variable occurrence. This can be modeled by a nonuniform distribution of the variables, which can result in distributions closer to industrial SAT instances. We study satisfiability thresholds of nonuniform random 2SAT with n variables and m clauses and with an arbitrary probability distribution \((p_i)_{i \in [n]}\) with \(p_1 \geq p_2 \geq \dots \geq p_n > 0\) over the n variables. We show for \(p_1^2 = \Theta(\sum_{i=1^n p_i^2)\) that the asymptotic satisfiability threshold is at \(m = \Theta((1− \sum_{i=1^n p_i^2) / (p_1 cdot (\sum_{i=2^n p_i^2)^{1/2}))\) and that it is coarse. For \(p_1^2 = o( \sum_{i=1^n p_i^2)\) we show that there is a sharp satisfiability threshold at \(m = (\sum_{i=1^n p_i^2)^{−1}\). This result generalizes the seminal works by Chvatal and Reed [FOCS 1992] and by Goerdt [JCSS 1996].

Peters, Jannik; Stephan, Daniel; Amon, Isabel; Gawendowicz, Hans; Lischeid, Julius; Salabarria, Julius; Umland, Jonas; Werner, Felix; Krejca, Martin S.; Rothenberger, Ralf; Kötzing, Timo; Friedrich, Tobias Mixed Integer Programming versus Evolutionary Computation for Optimizing a Hard RealWorld Staff Assignment Problem. International Conference on Automated Planning and Scheduling (ICAPS) 2019: 541554
Assigning staff to engagements according to hard constraints while optimizing several objectives is a task encountered by many companies on a regular basis. Simplified versions of such assignment problems are NPhard. Despite this, a typical approach to solving them consists of formulating them as mixed integer programming (MIP) problems and using a stateoftheart solver to get solutions that closely approximate the optimum. In this paper, we consider a complex realworld staff assignment problem encountered by the professional service company KPMG, with the goal of finding an algorithm that solves it faster and with a better solution than a commercial MIP solver. We follow the evolutionary algorithm (EA) metaheuristic and design a search heuristic which iteratively improves a solution using domainspecific mutation operators. Furthermore, we use a flow algorithm to optimally solve a subproblem, which tremendously reduces the search space for the EA. For our realworld instance of the assignment problem, given the same total time budget of \(100\) hours, a parallel EA approach finds a solution that is only \(1.7\)% away from an upper bound for the (unknown) optimum within under five hours, while the MIP solver Gurobi still has a gap of \(10.5\) %.

Friedrich, Tobias; Rothenberger, Ralf Sharpness of the Satisfiability Threshold for NonUniform Random kSAT. International Joint Conference on Artificial Intelligence (IJCAI) 2019: 61516155
Extended abstract
We study nonuniform random kSAT on n variables with an arbitrary probability distribution p on the variable occurrences. The number \(t = t(n)\) of randomly drawn clauses at which random formulas go from asymptotically almost surely (a.a.s.) satisfiable to a.a.s. unsatisfiable is called the satisfiability threshold. Such a threshold is called sharp if it approaches a step function as n increases. We show that a threshold t(n) for random kSAT with an ensemble \((p_n)_{n\in\mathbb{N}}\) of arbitrary probability distributions on the variable occurrences is sharp if \(\p\_2^2 = O_n(t^{2/k})\) and \(\p\_infty = o_n(t^k/(2k1) \log^{(k1)/(2k1)(t))\). This result generalizes Friedgut’s sharpness result from uniform to nonuniform random kSAT and implies sharpness for thresholds of a wide range of random kSAT models with heterogeneous probability distributions, for example such models where the variable probabilities follow a powerlaw distribution.

Gao, Ziyuan; Jain, Sanjay; Khoussainov, Bakhadyr; Li, Wei; Melnikov, Alexander; Seidel, Karen; Stephan, Frank Random Subgroups of Rationals. International Symposium on Mathematical Foundations of Computer Science (MFCS) 2019: 25:125:14
This paper introduces and studies a notion of algorithmic randomness for subgroups of rationals. Given a randomly generated additive subgroup \((G,+)\) of rationals, two main questions are addressed: first, what are the modeltheoretic and recursiontheoretic properties of \((G,+)\); second, what learnability properties can one extract from \(G\) and its subclass of finitely generated subgroups? For the first question, it is shown that the theory of \((G,+)\) coincides with that of the additive group of integers and is therefore decidable; furthermore, while the word problem for \(G\) with respect to any generating sequence for \(G\) is not even semidecidable, one can build a generating sequence \(\beta\) such that the word problem for \(G\) with respect to \(\beta\) is corecursively enumerable (assuming that the set of generators of \(G\) is limitrecursive). In regard to the second question, it is proven that there is a generating sequence \(\beta\) for \(G\) such that every nontrivial finitely generated subgroup of \(G\) is recursively enumerable and the class of all such subgroups of \(G\) is behaviourally correctly learnable, that is, every nontrivial finitely generated subgroup can be semantically identified in the limit (again assuming that the set of generators of \(G\) is limitrecursive). On the other hand, the class of nontrivial finitely generated subgroups of \(G\) cannot be syntactically identified in the limit with respect to any generating sequence for \(G\). The present work thus contributes to a recent line of research studying algorithmically random infinite structures and uncovers an interesting connection between the arithmetical complexity of the set of generators of a randomly generated subgroup of rationals and the learnability of its finitely generated subgroups.

Bilò, Davide; Friedrich, Tobias; Lenzner, Pascal; Melnichenko, Anna Geometric Network Creation Games. Symposium on Parallelism in Algorithms and Architectures (SPAA) 2019: 323332
Network Creation Games are a wellknown approach for explaining and analyzing the structure, quality and dynamics of realworld networks like the Internet and other infrastructure networks which evolved via the interaction of selfish agents without a central authority. In these games selfish agents which correspond to nodes in a network strategically buy incident edges to improve their centrality. However, past research on these games has only considered the creation of networks with unitweight edges. In practice, e.g. when constructing a fiberoptic network, the choice of which nodes to connect and also the induced price for a link crucially depends on the distance between the involved nodes and such settings can be modeled via edgeweighted graphs. We incorporate arbitrary edge weights by generalizing the wellknown model by Fabrikant et al.~[PODC'03] to edgeweighted host graphs and focus on the geometric setting where the weights are induced by the distances in some metric space. In stark contrast to the stateoftheart for the unitweight version, where the Price of Anarchy is conjectured to be constant and where resolving this is a major open problem, we prove a tight nonconstant bound on the Price of Anarchy for the metric version and a slightly weaker upper bound for the nonmetric case. Moreover, we analyze the existence of equilibria, the computational hardness and the game dynamics for several natural metrics. The model we propose can be seen as the gametheoretic analogue of a variant of the classical Network Design Problem. Thus, lowcost equilibria of our game correspond to decentralized and stable approximations of the optimum network design.

Friedrich, Tobias From Graph Theory to Network Science: The Natural Emergence of Hyperbolicity. Symposium Theoretical Aspects of Computer Science (STACS) 2019: 5:1–5:9
Network science is driven by the question which properties large realworld networks have and how we can exploit them algorithmically. In the past few years, hyperbolic graphs have emerged as a very promising model for scalefree networks. The connection between hyperbolic geometry and complex networks gives insights in both directions: (1) Hyperbolic geometry forms the basis of a natural and explanatory model for realworld networks. Hyperbolic random graphs are obtained by choosing random points in the hyperbolic plane and connecting pairs of points that are geometrically close. The resulting networks share many structural properties for example with online social networks like Facebook or Twitter. They are thus well suited for algorithmic analyses in a more realistic setting. (2) Starting with a realworld network, hyperbolic geometry is wellsuited for metric embeddings. The vertices of a network can be mapped to points in this geometry, such that geometric distances are similar to graph distances. Such embeddings have a variety of algorithmic applications ranging from approximations based on efficient geometric algorithms to greedy routing solely using hyperbolic coordinates for navigation decisions.

Bläsius, Thomas; Friedrich, Tobias; Sutton, Andrew M. On the Empirical Time Complexity of ScaleFree 3SAT at the Phase Transition. Tools and Algorithms for the Construction and Analysis of Systems (TACAS) 2019: 117134
The hardness of formulas at the solubility phase transition of random propositional satisfiability (SAT) has been intensely studied for decades both empirically and theoretically. Solvers based on stochastic local search (SLS) appear to scale very well at the critical threshold, while complete backtracking solvers exhibit exponential scaling. On industrial SAT instances, this phenomenon is inverted: backtracking solvers can tackle large industrial problems, where SLSbased solvers appear to stall. Industrial instances exhibit sharply different structure than uniform random instances. Among many other properties, they are often heterogeneous in the sense that some variables appear in many while others appear in only few clauses. We conjecture that the heterogeneity of SAT formulas alone already contributes to the tradeoff in performance between SLS solvers and complete backtracking solvers. We empirically determine how the run time of SLS vs. backtracking solvers depends on the heterogeneity of the input, which is controlled by drawing variables according to a scalefree distribution. Our experiments reveal that the efficiency of complete solvers at the phase transition is strongly related to the heterogeneity of the degree distribution. We report results that suggest the depth of satisfying assignments in complete search trees is influenced by the level of heterogeneity as measured by a powerlaw exponent. We also find that incomplete SLS solvers, which scale well on uniform instances, are not affected by heterogeneity. The main contribution of this paper utilizes the scalefree random 3SAT model to isolate heterogeneity as an important factor in the scaling discrepancy between complete and SLS solvers at the uniform phase transition found in previous works.

Bläsius, Thomas; Fischbeck, Philipp; Friedrich, Tobias; Schirneck, Martin Understanding the Effectiveness of Data Reduction in Public Transportation Networks. Workshop on Algorithms and Models for the Web Graph (WAW) 2019: 87101
Given a public transportation network of stations and connections, we want to find a minimum subset of stations such that each connection runs through a selected station. Although this problem is NPhard in general, realworld instances are regularly solved almost completely by a set of simple reduction rules. To explain this behavior, we view transportation networks as hitting set instances and identify two characteristic properties, locality and heterogeneity. We then devise a randomized model to generate hitting set instances with adjustable properties. While the heterogeneity does influence the effectiveness of the reduction rules, the generated instances show that locality is the significant factor. Beyond that, we prove that the effectiveness of the reduction rules is independent of the underlying graph structure. Finally, we show that high locality is also prevalent in instances from other domains, facilitating a fast computation of minimum hitting sets.

Echzell, Hagen; Friedrich, Tobias; Lenzner, Pascal; Molitor, Louise; Pappik, Marcus; Schöne, Friedrich; Sommer, Fabian; Stangl, David Convergence and Hardness of Strategic Schelling Segregation. Web and Internet Economics (WINE) 2019
The phenomenon of residential segregation was captured by Schelling's famous segregation model where two types of agents are placed on a grid and an agent is content with her location if the fraction of her neighbors which have the same type as her is at least \($\tau$\), for some \($0<\tau<1$\). Discontent agents simply swap their location with a randomly chosen other discontent agent or jump to a random empty cell. We analyze a generalized gametheoretic model of Schelling segregation which allows more than two agent types and more general underlying graphs modeling the residential area. For this we show that both aspects heavily influence the dynamic properties and the tractability of finding an optimal placement. We map the boundary of when improving response dynamics (IRD), i.e., the natural approach for finding equilibrium states, are guaranteed to converge. For this we prove several sharp threshold results where guaranteed IRD convergence suddenly turns into the strongest possible nonconvergence result: a violation of weak acyclicity. In particular, we show such threshold results also for Schelling's original model, which is in contrast to the standard assumption in many empirical papers. Furthermore, we show that in case of convergence, IRD find an equilibrium in \($\mathcal{O(m)$\) steps, where \($m$\) is the number of edges in the underlying graph and show that this bound is met in empirical simulations starting from random initial agent placements.

Friedrich, Tobias; Doerr, Carola; Arnold, Dirk V. Proceedings of the 15th ACM/SIGEVO Conference on Foundations of Genetic Algorithms, FOGA 2019, Potsdam, Germany, August 2729, 2019. 2019 ACM.
Editorship

Bläsius, Thomas; Karrer, Annette; Rutter, Ignaz Simultaneous Embedding: Edge Orderings, Relative Positions, Cutvertices. Algorithmica 2018: 12141277
A simultaneous embedding (with fixed edges) of two graphs \(G^1\) and \(G^2\) with common graph \(G = G^1 ∩ G^2\) is a pair of planar drawings of \(G^1\) and \(G^2\) that coincide on \(G\). It is an open question whether there is a polynomialtime algorithm that decides whether two graphs admit a simultaneous embedding (problem SEFE). In this paper, we present two results. First, a set of three lineartime preprocessing algorithms that remove certain substructures from a given SEFE instance, producing a set of equivalent Sefe instances without such substructures. The structures we can remove are (1) cutvertices of the union graph \(G^∪ = G^1 ∪ G^2\) , (2) most separating pairs of \(G^∪\), and (3) connected components of G that are biconnected but not a cycle. Second, we give an \(O(n³)\)time algorithm solving Sefe for instances with the following restriction. Let u be a pole of a Pnode \(µ\) in the SPQRtree of a block of \(G^1\) or \(G^2\). Then at most three virtual edges of \(µ\) may contain common edges incident to u. All algorithms extend to the sunflower case, i.e., to the case of more than two graphs pairwise intersecting in the same common graph.

Kötzing, Timo; Sudholt, Dirk Preface to the Special Issue on Theory of Genetic and Evolutionary Computation. Algorithmica 2018: 15751578
Evolutionary algorithms (EAs) are randomized search heuristics that can be employed to solve complex optimization problems, including multimodal or highly constrained problems. EAs work by mimicking principles from natural evolution: maintaining a collection of possible solutions (a population) and iteratively creating variants of the individuals (the offspring) and then choosing a new set of individuals for the next iteration (selection). EAs are popular because they represent generalpurpose optimizers that can be easily applied to various problems, even in cases where little or no indepth knowledge about the problem is available. In order to guide practitioners devising new and effective algorithms, theoretical computer scientists employ methods from the field of randomized algorithms to analyze the working principles of EAs with mathematical rigor. Key questions concern the impact of parameter choices (such as, for example, the offspring size or the choice of variation operators) as well as foundational work on developing powerful analysis methods. The theory track of the annual ACM Genetic and Evolutionary Computation Conference (GECCO) is the first tier event for advances in this direction. In this special issue six selected papers from the 2016 edition of the GECCO theory track are collected, each one of them carefully revised and extended to meet the high quality standards of Algorithmica.

Doerr, Benjamin; Doerr, Carola; Kötzing, Timo Static and SelfAdjusting Mutation Strengths for Multivalued Decision Variables. Algorithmica 2018: 17321768
The most common representation in evolutionary computation are bit strings. With very little theoretical work existing on how to use evolutionary algorithms for decision variables taking more than two values, we study the run time of simple evolutionary algorithms on some OneMaxlike functions defined over \(\Omega={0,1,\dots,r−1}^n\). We observe a crucial difference in how we extend the onebitflip and standardbit mutation operators to the multivalued domain. While it is natural to modify a random position of the string or select each position of the solution vector for modification independently with probability \(1/n\), there are various ways to then change such a position. If we change each selected position to a random value different from the original one, we obtain an expected run time of \(\Theta(n r \log n)\). If we change each selected position by \(+1\) or \(−1\) (random choice), the optimization time reduces to \(\Theta(n r + n \log n)\). If we use a random mutation strength \(i \in {0,1,\dots,r−1}\) with probability inversely proportional to \(i\) and change the selected position by \(+i\) or \(−i\) (random choice), then the optimization time becomes \(\Theta(n \log(r)(\log n + \log r))\), which is asymptotically faster than the previous if \(r = \omega(\log(n)\log\log(n))\). Interestingly, a better expected performance can be achieved with a selfadjusting mutation strength that is based on the success of previous iterations. For the mutation operator that modifies a randomly chosen position, we show that the selfadjusting mutation strength yields an expected optimization time of \(\Theta(n(\log n + \log r))\), which is best possible among all dynamic mutation strengths. In our proofs, we use a new multiplicative drift theorem for computing lower bounds, which is not restricted to processes that move only towards the target.

Bläsius, Thomas; Friedrich, Tobias; Krohmer, Anton Cliques in Hyperbolic Random Graphs. Algorithmica 2018: 23242344
Most complex real world networks display scalefree features. This characteristic motivated the study of numerous random graph models with a powerlaw degree distribution. There is, however, no established and simple model which also has a high clustering of vertices as typically observed in real data. Hyperbolic random graphs bridge this gap. This natural model has recently been introduced by Krioukov et al. and has shown theoretically and empirically to fulfill all typical properties of real world networks, including powerlaw degree distribution and high clustering. We study cliques in hyperbolic random graphs \(G\) and present new results on the expected number of \(k\)cliques \(E[K_k]\) and the size of the largest clique \(\omega(G)\). We observe that there is a phase transition at powerlaw exponent \(\beta = 3\). More precisely, for \(\beta\)\(\in\)\((2,3)\) we prove \(E[K_k] = \) \(n^{k(3\beta)/2} \Theta(k)^{k}\) and \(\omega(G) = \) \(\Theta\)\((n^{(3\beta)/2})\), while for \(\beta \geq 3\) we prove \(E[K_k]=n \Theta(k)^{k}\) and \(\omega(G)=\Theta(\log(n)/ \log\log n)\). Furthermore, we show that for \(\beta \geq 3\), cliques in hyperbolic random graphs can be computed in time \(O(n)\). If the underlying geometry is known, cliques can be found with worstcase runtime \(O(m n^{2.5})\) for all values of \(\beta\).

AbuKhzam, Faisal N.; Bazgan, Cristina; Casel, Katrin; Fernau, Henning Clustering with LowerBounded Sizes  A General GraphTheoretic Framework. Algorithmica 2018: 25172550
Classical clustering problems search for a partition of objects into a fixed number of clusters. In many scenarios, however, the number of clusters is not known or necessarily fixed. Further, clusters are sometimes only considered to be of significance if they have a certain size. We discuss clustering into sets of minimum cardinality \(k\) without a fixed number of sets and present a general model for these types of problems. This general framework allows the comparison of different measures to assess the quality of a clustering. We specifically consider nine qualitymeasures and classify the complexity of the resulting problems with respect to \(k\). Further, we derive some polynomialtime solvable cases for \(k=2\) with connections to matchingtype problems which, among other graph problems, then are used to compute approximations for larger values of \(k\).

Bringmann, Karl; Friedrich, Tobias; Krohmer, Anton Deanonymization of Heterogeneous Random Graphs in Quasilinear Time. Algorithmica 2018: 3397–3427
There are hundreds of online social networks with altogether billions of users. Many such networks publicly release structural information, with all personal information removed. Empirical studies have shown, however, that this provides a false sense of privacy  it is possible to identify almost all users that appear in two such anonymized network as long as a few initial mappings are known. We analyze this problem theoretically by reconciling two versions of an artificial powerlaw network arising from independent subsampling of vertices and edges. We present a new algorithm that identifies most vertices and makes no wrong identifications with high probability. The number of vertices matched is shown to be asymptotically optimal. For an nvertex graph, our algorithm uses \(n^\varepsilon\) seed nodes (for an arbitrarily small \(\varepsilon\)) and runs in quasilinear time. This improves previous theoretical results which need \(\Theta(n)\) seed nodes and have runtimes of order \(n^{1+\Omega(1)}\). Additionally, the applicability of our algorithm is studied experimentally on different networks.

Aschenbach, Martin; Kötzing, Timo; Seidel, Karen Learning from Informants: Relations between Learning Success Criteria. ArXiv 2018: 37
Learning from positive and negative information, socalled informants, being one of the models for human and machine learning introduced by Gold [1967], is investigated. Particularly, naturally arising questions about this learning setting, originating in results on learning from solely positive information, are answered. By a carefully arranged argument learners can be assumed to only change their hypothesis in case it is inconsistent with the data (such a learning behavior is called conservative). The deduced main theorem states the relations between the most important delayable learning success criteria, being the ones not ruined by a delayed in time hypothesis output. Additionally, our investigations concerning the nondelayable requirement of consistent learning underpin the claim for delayability being the right structural property to gain a deeper understanding concerning the nature of learning success criteria. Moreover, we obtain an anomalous hierarchy when allowing for an increasing finite number of anomalies of the hypothesized language by the learner compared with the language to be learned. In contrast to the vacillatory hierarchy for learning from solely positive information, we observe a duality depending on whether infinitely many vacillations between different (almost) correct hypotheses are still considered a successful learning behavior.

Bläsius, Thomas; Stumpf, Peter; Ueckerdt, Torsten Local and Union Boxicity. Discrete Mathematics 2018: 1307  1315
The boxicity \(box(H)\) of a graph \(H\) is the smallest integer \(d\) such that \(H\) is the intersection of \(d\) interval graphs, or equivalently, that \(H\) is the intersection graph of axisaligned boxes in \(R^d\). These intersection representations can be interpreted as covering representations of the complement \(H^c\) of \(H\) with cointerval graphs, that is, complements of interval graphs. We follow the recent framework of global, local and folded covering numbers (Knauer and Ueckerdt, 2016) to define two new parameters: the local boxicity \(box_l(H)\) and the union boxicity \(box_u(H)\) of \(H\). The union boxicity of \(H\) is the smallest \(d\) such that \(H^c\) can be covered with \(d\) vertexdisjoint unions of cointerval graphs, while the local boxicity of \(H\) is the smallest \(d\) such that \(H^c\) can be covered with cointerval graphs, at most \(d\) at every vertex. We show that for every graph \(H\) we have \(box_l(H) leq box_u(H) leq box(H) \) and that each of these inequalities can be arbitrarily far apart. Moreover, we show that local and union boxicity are also characterized by intersection representations of appropriate axisaligned boxes in \(R^d\) . We demonstrate with a few striking examples, that in a sense, the local boxicity is a better indication for the complexity of a graph, than the classical boxicity.

Dang, DucCuong; Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Lehre, Per Kristian; Oliveto, Pietro S.; Sudholt, Dirk; Sutton, Andrew M. Escaping Local Optima Using Crossover with Emergent Diversity. IEEE Transactions on Evolutionary Computation 2018: 484497
Population diversity is essential for the effective use of any crossover operator. We compare seven commonly used diversity mechanisms and prove rigorous run time bounds for the \((\mu+1)\) GA using uniform crossover on the fitness function \(Jump_k\). All previous results in this context only hold for unrealistically low crossover probability \(p_c=O(k/n)\), while we give analyses for the setting of constant \(p_c < 1\) in all but one case. Our bounds show a dependence on the problem size \(n\), the jump length \(k\), the population size \(\mu\), and the crossover probability \(p_c\). For the typical case of constant \(k > 2\) and constant \(p_c\), we can compare the resulting expected optimisation times for different diversity mechanisms assuming an optimal choice of \(\mu\): \(O(n^{k1})\) for duplicate elimination/minimisation, \(O(n^2 \log n)\) for maximising the convex hull, \(O(n \log n)\) for det. crowding (assuming \(p_c = k/n\)), \(O(n \log n)\) for maximising the Hamming distance, \(O(n \log n)\) for fitness sharing, \(O(n \log n)\) for the singlereceiver island model. This proves a sizeable advantage of all variants of the \((\mu+1)\) GA compared to the (1+1) EA, which requires \(\Theta(n^k)\). In a short empirical study we confirm that the asymptotic differences can also be observed experimentally.

Bläsius, Thomas; Friedrich, Tobias; Krohmer, Anton; Laue, Sören Efficient Embedding of ScaleFree Graphs in the Hyperbolic Plane. IEEE/ACM Transactions on Networking 2018: 920933
Hyperbolic geometry appears to be intrinsic in many large real networks. We construct and implement a new maximum likelihood estimation algorithm that embeds scalefree graphs in the hyperbolic space. All previous approaches of similar embedding algorithms require at least a quadratic runtime. Our algorithm achieves quasilinear runtime, which makes it the first algorithm that can embed networks with hundreds of thousands of nodes in less than one hour. We demonstrate the performance of our algorithm on artificial and real networks. In all typical metrics, like Loglikelihood and greedy routing, our algorithm discovers embeddings that are very close to the ground truth.

Baum, Moritz; Bläsius, Thomas; Gemsa, Andreas; Rutter, Ignaz; Wegner, Franziska Scalable Exact Visualization of Isocontours in Road Networks via MinimumLink Paths. Journal of Computational Geometry 2018: 2773
Isocontours in road networks represent the area that is reachable from a source within a given resource limit. We study the problem of computing accurate isocontours in realistic, largescale networks. We propose isocontours represented by polygons with minimum number of segments that separate reachable and unreachable components of the network. Since the resulting problem is not known to be solvable in polynomial time, we introduce several heuristics that run in (almost) linear time and are simple enough to be implemented in practice. A key ingredient is a new practical lineartime algorithm for minimumlink paths in simple polygons. Experiments in a challenging realistic setting show excellent performance of our algorithms in practice, computing nearoptimal solutions in a few milliseconds on average, even for long ranges.

Rizzo, Manuel; Battaglia, Francesco Statistical and Computational Tradeoff in Genetic AlgorithmBased Estimation. Journal of Statistical Computation and Simulation 2018: 30813097
When a genetic algorithm (GA) is employed in a statistical problem, the result is affected by both variability due to sampling and the stochastic elements of algorithm. Both of these components should be controlled in order to obtain reliable results. In the present work we analyze parametric estimation problems tackled by GAs, and pursue two objectives: the first one is related to a formal variability analysis of final estimates, showing that it can be easily decomposed in the two sources of variability. In the second one we introduce a framework of GA estimation with fixed computational resources, which is a form of statistical and the computational tradeoff question, crucial in recent problems. In this situation the result should be optimal from both the statistical and computational point of view, considering the two sources of variability and the constraints on resources. Simulation studies will be presented for illustrating the proposed method and the statistical and computational tradeoff question.

Friedrich, Tobias; Krohmer, Anton On the diameter of hyperbolic random graphs. SIAM Journal on Discrete Mathematics 2018: 13141334
Large realworld networks are typically scalefree. Recent research has shown that such graphs are described best in a geometric space. More precisely, the internet can be mapped to a hyperbolic space such that geometric greedy routing is close to optimal (Boguñá, Papadopoulos, and Krioukov. Nature Communications, 1:62, 2010). This observation has pushed the interest in hyperbolic networks as a natural model for scalefree networks. Hyperbolic random graphs follow a power law degree distribution with controllable exponent \(\beta\) and show high clustering (Gugelmann, Panagiotou, and Peter. ICALP, pp. 573–585, 2012). For understanding the structure of the resulting graphs and for analyzing the behavior of network algorithms, the next question is bounding the size of the diameter. The only known explicit bound is \(O(\)\((\log n)\)\(^{32/((3  \beta)(5  \beta))+1})\)(Kiwi and Mitsche. ANALCO, pp. 26–39, 2015). We present two much simpler proofs for an improved upper bound of \(O((\log n)\)\(^{2/(3  \beta)})\) and a lower bound of \(\Omega(\log n)\). If \(\beta > 3\), we show that the latter bound is tight by proving an upper bound of \(O(\log n)\) for the diameter.

Friedrich, Tobias; Katzmann, Maximilian; Krohmer, Anton Unbounded Discrepancy of Deterministic Random Walks on Grids. SIAM Journal on Discrete Mathematics 2018: 24412452
Random walks are frequently used in randomized algorithms. We study a derandomized variant of a random walk on graphs, called rotorrouter model. In this model, instead of distributing tokens randomly, each vertex serves its neighbors in a fixed deterministic order. For most setups, both processes behave remarkably similar: Starting with the same initial configuration, the number of tokens in the rotorrouter model deviates only slightly from the expected number of tokens on the corresponding vertex in the random walk model. The maximal difference over all vertices and all times is called single vertex discrepancy. Cooper and Spencer (2006) showed that on \(\mathbb{Z}^{d}\) the single vertex discrepancy is only a constant \(c_d\). Other authors also determined the precise value of \(c_d\) for \(d=1,2\). All these results, however, assume that initially all tokens are only placed on one partition of the bipartite graph \(\mathbb{Z}^{d}\). We show that this assumption is crucial by proving that otherwise the single vertex discrepancy can become arbitrarily large. For all dimensions \(d\geq1\) and arbitrary discrepancies~\(\ell \geq 0\), we construct configurations that reach a discrepancy of at least \(\ell\).

Krejca, Martin S.; Witt, Carsten Lower Bounds on the Run Time of the Univariate Marginal Distribution Algorithm on OneMax. Theoretical Computer Science 2018
The Univariate Marginal Distribution Algorithm (UMDA)  a popular estimationofdistribution algorithm  is studied from a run time perspective. On the classical OneMax benchmark function on bit strings of length \(n\), a lower bound of \(\Omega(\lambda + \mu \sqrt{n} + n\log n)\), where \(\mu\) and \(\lambda\) are algorithmspecific parameters, on its expected run time is proved. This is the first direct lower bound on the run time of UMDA. It is stronger than the bounds that follow from general blackbox complexity theory and is matched by the run time of many evolutionary algorithms. The results are obtained through advanced analyses of the stochastic change of the frequencies of bit values maintained by the algorithm, including carefully designed potential functions. These techniques may prove useful in advancing the field of run time analysis for estimationofdistribution algorithms in general.

Bazgan, Cristina; Brankovic, Ljiljana; Casel, Katrin; Fernau, Henning; Jansen, Klaus; Klein, KimManuel; Lampis, Michael; Liedloff, Mathieu; Monnot, Jérôme; Paschos, Vangelis Th. The many facets of upper domination. Theoretical Computer Science 2018: 225
This paper studies Upper Domination, i.e., the problem of computing the maximum cardinality of a minimal dominating set in a graph with respect to classical and parameterised complexity as well as approximability.

Doskoč, Vanja Confident Iterative Learning in Computational Learning Theory. Current Trends in Theory and Practice of Computer Science (SOFSEM) 2018: 3042
In inductive inference various types of learning have emerged. The main aim of this paper is to investigate a new type of learning, the confident iterative learning. Given a class to be learnt, the idea here is to merge the following two concepts. For confidence, we require the learner to converge on any set, however, it only needs to be correct on the sets in the class. To be iterative, we restrict the learner’s memory on previ ous inputs and calculations to its last hypothesis. Investigating the new learner, we will provide negative and positive examples, as well as some properties the confident iterative learner possesses. This will peak at a classification theorem for certain types of classes. Next, we will introduce and compare different types of confidence, focus ing on the learner’s behaviour on sets outside of the class. Lastly, we will focus on the possible hypotheses. Introducing learning with respect to hypothesis spaces, we will provide examples witnessing that exact, class preserving and class comprising learning are different.

Battaglia, Francesco; Cucina, Domenico; Rizzo, Manuel Periodic Autoregressive Models with Multiple Structural Changes by Genetic Algorithms. Mathematical and Statistical Methods for Actuarial Sciences and Finance (MAF) 2018: 107110
We present a model and a computational procedure for dealing with seasonality and regime changes in time series. In this work we are interested in time series which in addition to trend display seasonality in mean, in autocorrelation and in variance. These type of series appears in many areas, including hydrology, meteorology, economics and finance. The seasonality is accounted for by subset PAR modelling, for which each season follows a possibly different Autoregressive model. Levels, trend, autoregressive parameters and residual variances are allowed to change their values at fixed unknown times. The identification of number and location of structural changes, as well as PAR lags indicators, is based on Genetic Algorithms, which are suitable because of high dimensionality of the discrete search space. An application to Italian industrial production index time series is also proposed.

Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Krohmer, Anton Hyperbolic Embeddings for NearOptimal Greedy Routing. Algorithm Engineering and Experiments (ALENEX) 2018: 199208
Greedy routing computes paths between nodes in a network by successively moving to the neighbor closest to the target with respect to coordinates given by an embedding into some metric space. Its advantage is that only local information is used for routing decisions. We present different algorithms for generating graph embeddings into the hyperbolic plane that are well suited for greedy routing. In particular our embeddings guarantee that greedy routing always succeeds in reaching the target and we try to minimize the lengths of the resulting greedy paths. We evaluate our algorithm on multiple generated and real wold networks. For networks that are generally assumed to have a hidden underlying hyperbolic geometry, such as the Internet graph, we achieve nearoptimal results, i.e., the resulting greedy paths are only slightly longer than the corresponding shortest paths. In the case of the Internet graph, they are only \(6\%\) longer when using our best algorithm, which greatly improves upon the previous best known embedding, whose creation required substantial manual intervention.

Friedrich, Tobias; Quinzan, Francesco; Wagner, Markus Escaping Large Deceptive Basins of Attraction with Heavy Mutation Operators. Genetic and Evolutionary Computation Conference (GECCO) 2018: 293300
In many Evolutionary Algorithms (EAs), a parameter that needs to be tuned is that of the mutation rate, which determines the probability for each decision variable to be mutated. Typically, this rate is set to 1/n for the duration of the optimization, where n is the number of decision variables. This setting has the appeal that the expected number of mutated variables per iteration is one. In a recent theoretical study, it was proposed to sample the number of mutated variables from a powerlaw distribution. This results into a significantly higher probability on larger numbers of mutations, so that escaping local optima becomes more probable. In this paper, we propose another class of nonuniform mutation rates. We study the benefits of this operator in terms of averagecase blackbox complexity analysis and experimental comparison. We consider both pseudoBoolean artificial landscapes and combinatorial problems (the Minimum Vertex Cover and the Maximum Cut). We observe that our nonuniform mutation rates significantly outperform the standard choices, when dealing with landscapes that exhibit large deceptive basins of attraction.

Friedrich, Tobias; Kötzing, Timo; Quinzan, Francesco; Sutton, Andrew M. Improving the Run Time of the (1+1) Evolutionary Algorithm with Luby Sequences. Genetic and Evolutionary Computation Conference (GECCO) 2018: 301308
In the context of black box optimization, the most common way to handle deceptive attractors is to periodically restart the algorithm. In this paper, we explore the benefits of combining the simple \((1+1)\) Evolutionary Algorithm (EA) with the Luby Universal Strategy  the \((1+1)~EA_{\mathcal{U}}\), a metaheuristic that does not require parameter tuning. We first consider two artificial pseudoBoolean landscapes, on which the \((1+1)~EA\) exhibits exponential run time. We prove that the \((1+1)~EA_{\mathcal{U}}\) has polynomial run time on both instances. We then consider the Minimum Vertex Cover on two classes of graphs. Again, the \((1+1)~EA\) yields exponential run time on those instances, and the \((1+1)~EA_{\mathcal{U}}\) finds the global optimum in polynomial time. We conclude by studying the Makespan Scheduling. We consider an instance on which the \((1+1)~EA\) does not find a \((4/3\epsilon)\)approximation in polynomial time, and we show that the \((1+1)~EA_{\mathcal{U}}\) reaches a \((4/3\epsilon)\)approximation in polynomial time. We then prove that the \((1+1)~EA_{\mathcal{U}}\) serves as an Efficient Polynomialtime Approximation Scheme (EPTAS) for the Partition Problem, for a \((1+\epsilon)\)approximation with \(\epsilon > 4/n\).

Gao, Wanru; Friedrich, Tobias; Neumann, Frank; Hercher, Christian Randomized Greedy Algorithms for Covering Problems. Genetic and Evolutionary Computation Conference (GECCO) 2018: 309315
Greedy algorithms provide a fast and often also effective solution to many combinatorial optimization problems. However, it is well known that they sometimes lead to low quality solutions on certain instances. In this paper, we explore the use of randomness in greedy algorithms for the minimum vertex cover and dominating set problem and compare the resulting performance against their deterministic counterpart. Our algorithms are based on a parameter \(\gamma\) which allows to explore the spectrum between uniform and deterministic greedy selection in the steps of the algorithm and our theoretical and experimental investigations point out the benefits of incorporating randomness into greedy algorithms for the two considered combinatorial optimization problems.

Doerr, Benjamin; Krejca, Martin S. Significancebased EstimationofDistribution Algorithms. Genetic and Evolutionary Computation Conference (GECCO) 2018: 14831490
Estimationofdistribution algorithms (EDAs) are randomized search heuristics that maintain a stochastic model of the solution space. This model is updated from iteration to iteration based on the quality of the solutions sampled according to the model. As previous works show, this shortterm perspective can lead to erratic updates of the model, in particular, to bitfrequencies approaching a random boundary value. This can lead to significant performance losses. In order to overcome this problem, we propose a new EDA that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this significancebased compact genetic algorithm (sigcGA) optimizes the common benchmark functions OneMax and LeadingOnes both in \(O(n\log n)\) time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed scGA – an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model – we prove that it optimizes OneMax only in a time exponential in the hypothetical population size \(1/\rho\).

Bläsius, Thomas; Freiberger, Cedric; Friedrich, Tobias; Katzmann, Maximilian; MontenegroRetana, Felix; Thieffry, Marianne Efficient Shortest Paths in ScaleFree Networks with Underlying Hyperbolic Geometry. International Colloquium on Automata, Languages, and Programming (ICALP) 2018: 20:120:14
A common way to accelerate shortest path algorithms on graphs is the use of a bidirectional search, which simultaneously explores the graph from the start and the destination. It has been observed recently that this strategy performs particularly well on scalefree realworld networks. Such networks typically have a heterogeneous degree distribution (e.g., a powerlaw distribution) and high clustering (i.e., vertices with a common neighbor are likely to be connected themselves). These two properties can be obtained by assuming an underlying hyperbolic geometry. To explain the observed behavior of the bidirectional search, we analyze its running time on hyperbolic random graphs and prove that it is \(\tilde{O}(n\)\(^{2  1/ \alpha}+\)\(n^{1/(2\alpha)}\)\(+ \delta_{\max})\) with high probability, where \(\alpha\)\(\in\)\((0.5, 1)\) controls the powerlaw exponent of the degree distribution, and \(\delta_{\max}\) is the maximum degree. This bound is sublinear, improving the obvious worstcase linear bound. Although our analysis depends on the underlying geometry, the algorithm itself is oblivious to it.

Casel, Katrin Resolving Conflicts for LowerBounded Clustering. International Symposium on Parameterized and Exact Computation (IPEC) 2018
This paper considers the effect of nonmetric distances for lowerbounded clustering, i.e., the problem of computing a partition for a given set of objects with pairwise distance, such that each set has a certain minimum cardinality (as required for anonymisation or balanced facility location problems). We discuss lowerbounded clustering with the objective to minimise the maximum radius or diameter of the clusters. For these problems there exists a 2approximation but only if the pairwise distance on the objects satisfies the triangle inequality, without this property no polynomialtime constant factor approximation is possible. We try to resolve or at least soften this effect of nonmetric distances by devising particular strategies to deal with violations of the triangle inequality ("conflicts"). With parameterised algorithmics, we find that if the number of such conflicts is not too large, constant factor approximations can still be computed efficiently. In particular, we introduce parameterised approximations with respect to not just the number of conflicts but also for the vertex cover number of the "conflict graph" (graph induced by conflicts). Interestingly, we salvage the approximation ratio of 2 for diameter while for radius it is only possible to show a ratio of 3. For the parameter vertex cover number of the conflict graph this worsening in ratio is shown to be unavoidable, unless FPT=W[2]. We further discuss improvements for diameter by choosing the (induced) \(P_3\)cover number of the conflict graph as parameter and complement these by showing that, unless FPT=W[1], there exists no constant factor parameterised approximation with respect to the parameter split vertex deletion set.

Cucina, Domenico; Rizzo, Manuel; Ursu, Eugen Identification of Multiregime Periodic Autoregressive Models by Genetic Algorithms. International Conference on Time Series and Forecasting (ITISE) 2018: 396407
This paper develops a procedure for identifying multiregime Periodic AutoRegressive (PAR) models. In each regime a possibly dif ferent PAR model is built, for which changes can be due to the seasonal means, the autocorrelation structure or the variances. Number and lo cations of changepoints which subdivide the time span are detected by means of Genetic Algorithms (GAs), that optimize an identification cri terion. The method is evaluated by means of simulation studies, and is then employed to analyze shrimp fishery data.

Scheibel, Willy; Weyand, Christopher; Döllner, Jürgen EvoCells – A Treemap Layout Algorithm for Evolving Tree Data. 9th International Conference on Information Visualization Theory and Applications (IVAPP) 2018: 273280
We propose the rectangular treemap layout algorithm EvoCells that maps changes in treestructured data onto an initial treemap layout. Changes in topology and node weights are mapped to insertion, removal, growth, and shrinkage of the layout rectangles. Thereby, rectangles displace their neighbors and stretch their enclosing rectangles with a runtime complexity of O(nlogn). An evaluation using layout stability metrics on the opensource ElasticSearch software system suggests EvoCells as a valid alternative for stable treemap layouting.

Göbel, Andreas; Lagodzinski, J. A. Gregor; Seidel, Karen Counting Homomorphisms to Trees Modulo a Prime. International Symposium on Mathematical Foundations of Computer Science (MFCS) 2018: 49:149:13
Many important graph theoretic notions can be encoded as counting graph homomorphism problems, such as partition functions in statistical physics, in particular independent sets and colourings. In this article we study the complexity of~\($\#_p\textsc{HomsTo}H$\), the problem of counting graph homomorphisms from an input graph to a graph \($H$\) modulo a prime number~\($p$\). Dyer and Greenhill proved a dichotomy stating that the tractability of nonmodular counting graph homomorphisms depends on the structure of the target graph. Many intractable cases in nonmodular counting become tractable in modular counting due to the common phenomenon of cancellation. In subsequent studies on counting modulo~\($2$\), however, the influence of the structure of~\($H$\) on the tractability was shown to persist, which yields similar dichotomies. Our main result states that for every tree~\($H$\) and every prime~\($p$\) the problem \($\#_p\textsc{HomsTo}H$\) is either polynomial time computable or \($\#_p\mathsf{P}$\)complete. This relates to the conjecture of Faben and Jerrum stating that this dichotomy holds for every graph \($H$\) when counting modulo~2. In contrast to previous results on modular counting, the tractable cases of \($\#_p\textsc{HomsTo}H$\) are essentially the same for all values of the modulo when \($H$\) is a tree. To prove this result, we study the structural properties of a homomorphism. As an important interim result, our study yields a dichotomy for the problem of counting weighted independent sets in a bipartite graph modulo some prime~\($p$\). These results are the first suggesting that such dichotomies hold not only for the onebit functions of the modulo~2 case but also for the modular counting functions of all primes~\($p$\).

Kötzing, Timo; Lagodzinski, J. A. Gregor; Lengler, Johannes; Melnichenko, Anna Destructiveness of Lexicographic Parsimony Pressure and Alleviation by a Concatenation Crossover in Genetic Programming. Parallel Problem Solving From Nature (PPSN) 2018: 4254
For theoretical analyses there are two specifics distinguishing GP from many other areas of evolutionary computation. First, the variable size representations, in particular yielding a possible bloat (i.e. the growth of individuals with redundant parts). Second, the role and realization of crossover, which is particularly central in GP due to the treebased representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover had a surprisingly little share in this work. We analyze a simple crossover operator in combination with local search,where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); the resulting algorithm is denoted ConcatenationCrossover GP. For this purpose three variants of the wellstudied Majority test function with large plateaus are considered. We show that the Concatenation Crossover GP can efficiently optimize these test functions,while local search cannot be efficient for all three variants independent of employing bloat control.

Kötzing, Timo; Krejca, Martin S. FirstHitting Times for Finite State Spaces. Parallel Problem Solving From Nature (PPSN) 2018: 7991
One of the most important aspects of a randomized algorithm is bounding its expected run time on various problems. Formally speaking, this means bounding the expected firsthitting time of a random process. The two arguably most popular tools to do so are the fitness level method and drift theory. The fitness level method considers arbitrary transition probabilities but only allows the process to move toward the goal. On the other hand, drift theory allows the process to move into any direction as long as it moves closer to the goal in expectation; however, this tendency has to be monotone and, thus, the transition probabilities cannot be arbitrary. We provide a result that combines the benefit of these two approaches: our result gives a lower and an upper bound for the expected firsthitting time of a random process over \(\{0, \ldots ,n\}\) that is allowed to move forward and backward by 1 and can use arbitrary transition probabilities. In case that the transition probabilities are known, our bounds coincide and yield the exact value of the expected firsthitting time. Further, we also state the stationary distribution as well as the mixing time of a special case of our scenario.

Kötzing, Timo; Krejca, Martin S. FirstHitting Times Under Additive Drift. Parallel Problem Solving From Nature (PPSN) 2018: 92104
For the last ten years, almost every theoretical result concerning the expected run time of a randomized search heuristic used drift theory, making it the arguably most important tool in this domain. Its success is due to its ease of use and its powerful result: drift theory allows the user to derive bounds on the expected firsthitting time of a random process by bounding expected local changes of the process – the drift. This is usually far easier than bounding the expected firsthitting time directly. Due to the widespread use of drift theory, it is of utmost importance to have the best drift theorems possible. We improve the fundamental additive, multiplicative, and variable drift theorems by stating them in a form as general as possible and providing examples of why the restrictions we keep are still necessary. Our additive drift theorem for upper bounds only requires the process to be nonnegative, that is, we remove unnecessary restrictions like a finite, discrete, or bounded search space. As corollaries, the same is true for our upper bounds in the case of variable and multiplicative drift

Frahnow, Clemens; Kötzing, Timo Ring Migration Topology Helps Bypassing Local Optima. Parallel Problem Solving From Nature (PPSN) 2018: 129140
Running several evolutionary algorithms in parallel and occasionally exchanging good solutions is referred to as island models. The idea is that the independence of the different islands leads to diversity, thus possibly exploring the search space better. Many theoretical analyses so far have found a complete (or sufficiently quickly expanding) topology as underlying migration graph most efficient for optimization, even though a quick dissemination of individuals leads to a loss of diversity. We suggest a simple fitness function Fork with two local optima parametrized by \(r \geq 2\) and a scheme for composite fitness functions. We show that, while the (1+1) EA gets stuck in a bad local optimum and incurs a run time of \(\Theta(n^{2r})\) fitness evaluations on Fork, island models with a complete topology can achieve a run time of \(\Theta(n^{1.5r})\) by making use of rare migrations in order to explore the search space more effectively. Finally, the ring topology, making use of rare migrations and a large diameter, can achieve a run time of \(\tilde{\Theta}(n^r)\), the black box complexity of Fork. This shows that the ring topology can be preferable over the complete topology in order to maintain diversity.

Friedrich, Tobias; Göbel, Andreas; Quinzan, Francesco; Wagner, Markus Heavytailed Mutation Operators in SingleObjective Combinatorial Optimization. Parallel Problem Solving From Nature (PPSN) 2018: 134145
A core feature of evolutionary algorithms is their mutation operator. Recently, much attention has been devoted to the study of mutation operators with dynamic and nonuniform mutation rates. Following up on this line of work, we propose a new mutation operator and analyze its performance on the (1+1) Evolutionary Algorithm (EA). Our analyses show that this mutation operator competes with preexisting ones, when used by the (1+1)EA on classes of problems for which results on the other mutation operators are available. We present a jump function for which the performance of the (1+1)EA using any static uniform mutation and any restart strategy can be worse than the performance of the (1+1)EA using our mutation operator with no restarts. We show that the (1+1)EA using our mutation operator finds a (1/3)approximation ratio on any nonnegative submodular function in polynomial time. This performance matches that of combinatorial local search algorithms specifically designed to solve this problem. Finally, we evaluate experimentally the performance of the (1+1)EA using our operator, on realworld graphs of different origins with up to \(\sim\)37,000 vertices and \(\sim\)1.6 million edges. In comparison with uniform mutation and a recently proposed dynamic scheme our operator comes out on top on these instances.

Chauhan, Ankit; Lenzner, Pascal; Molitor, Louise Schelling Segregation with Strategic Agents. Symposium on Algorithmic Game Theory (SAGT) 2018
Schelling’s segregation model is a landmark model in sociology. It shows the counterintuitive phenomenon that residential segregation between individuals of different groups can emerge even when all involved individuals are tolerant. Although the model is widely studied, no pure gametheoretic version where rational agents strategically choose their location exists. We close this gap by introducing and analyzing generalized gametheoretic models of Schelling segregation, where the agents can also have individual location preferences. For our models we investigate the convergence behavior and the efficiency of their equilibria. In particular, we prove guaranteed convergence to an equilibrium in the version which is closest to Schelling’s original model. Moreover, we provide tight bounds on the Price of Anarchy.

Friedrich, Tobias; Rothenberger, Ralf Sharpness of the Satisfiability Threshold for NonUniform Random kSAT. Theory and Applications of Satisfiability Testing (SAT) 2018: 273291
Best Paper Award
We study nonuniform random kSAT on n variables with an arbitrary probability distribution p on the variable occurrences. The number \(t = t(n)\) of randomly drawn clauses at which random formulas go from asymptotically almost surely (a.a.s.) satisfiable to a.a.s. unsatisfiable is called the satisfiability threshold. Such a threshold is called sharp if it approaches a step function as n increases. We show that a threshold t(n) for random kSAT with an ensemble \((p_n)_{n\in\mathbb{N}}\) of arbitrary probability distributions on the variable occurrences is sharp if \(\p\_2^2 = O_n(t^{2/k})\) and \(\p\_infty = o_n(t^k/(2k1) \log^{(k1)/(2k1)(t))\). This result generalizes Friedgut’s sharpness result from uniform to nonuniform random kSAT and implies sharpness for thresholds of a wide range of random kSAT models with heterogeneous probability distributions, for example such models where the variable probabilities follow a powerlaw distribution.

Battaglia, Francesco; Cucina, Domenico; Rizzo, Manuel Generalized Periodic Autoregressive Models for Trend and Seasonality Varying Time Series. Scientific Meeting of the Italian Statistical Society (SIS) 2018
Many nonstationary time series exhibit changes in the trend and seasonality structure, that may be modeled by splitting the time axis into different regimes. We propose multiregime models where, inside each regime, the trend is linear and seasonality is explained by a Periodic Autoregressive model. In addition, for achieving parsimony, we allow season grouping, i.e. seasons may consist of one, two, or more consecutive observations. Identification is obtained by means of a Genetic Algorithm that minimizes an identification criterion.

Bläsius, Thomas; Eube, Jan; Feldtkeller, Thomas; Friedrich, Tobias; Krejca, Martin S.; Lagodzinski, J. A. Gregor; Rothenberger, Ralf; Severin, Julius; Sommer, Fabian; Trautmann, Justin Memoryrestricted Routing With Tiled Map Data. IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2018: 33473354
Modern routing algorithms reduce query time by depending heavily on preprocessed data. The recently developed Navigation Data Standard (NDS) enforces a separation between algorithms and map data, rendering preprocessing inapplicable. Furthermore, map data is partitioned into tiles with respect to their geographic coordinates. With the limited memory found in portable devices, the number of tiles loaded becomes the major factor for run time. We study routing under these restrictions and present new algorithms as well as empirical evaluations. Our results show that, on average, the most efficient algorithm presented uses more than 20 times fewer tile loads than a normal A*.

Bilò, Davide; Lenzner, Pascal On the Tree Conjecture for the Network Creation Game. Symposium on the Theoretical Aspects of Computer Science (STACS) 2018: 14:114:15
Selfish Network Creation focuses on modeling real world networks from a gametheoretic point of view. One of the classic models by Fabrikant et al. [PODC'03] is the network creation game, where agents correspond to nodes in a network which buy incident edges for the price of alpha per edge to minimize their total distance to all other nodes. The model is wellstudied but still has intriguing open problems. The most famous conjectures state that the price of anarchy is constant for all \($\alpha$\) and that for \($\alpha \geq n$\) all equilibrium networks are trees. We introduce a novel technique for analyzing stable networks for high edgeprice alpha and employ it to improve on the best known bounds for both conjectures. In particular we show that for \($\alpha > 4n13$\) all equilibrium networks must be trees, which implies a constant price of anarchy for this range of alpha. Moreover, we also improve the constant upper bound on the price of anarchy for equilibrium trees.

Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Krohmer, Anton; Striebel, Jonathan Towards a Systematic Evaluation of Generative Network Models. Workshop on Algorithms and Models for the Web Graph (WAW) 2018: 99114
Generative graph models play an important role in network science. Unlike realworld networks, they are accessible for mathematical analysis and the number of available networks is not limited. The explanatory power of results on generative models, however, heavily depends on how realistic they are. We present a framework that allows for a systematic evaluation of generative network models. It is based on the question whether realworld networks can be distinguished from generated graphs with respect to certain graph parameters. As a proof of concept, we apply our framework to four popular random graph models (ErdősRényi, BarabásiAlbert, ChungLu, and hyperbolic random graphs). Our experiments for example show that all four models are bad representations for Facebook's social networks, while ChungLu and hyperbolic random graphs are good representations for other networks, with different strengths and weaknesses.

Neubert, Stefan Mechanisms for Network Creation. Master Thesis, Hasso Plattner Institute 2018
We introduce and analyze the possibilities of a new model for network creation by autonomous selfish agents: Unlike in typical network formation games such as the wellknown model of Fabrikant et al. [PODC'03], the final network is not directly shaped by the players of a game. Instead, we design mechanisms that take edge preferences of agents as input for a social choice function and return a network that respects those preferences. In addition, it caters for compliance with global restrictions on the network topology and tries to establish several properties, such as global efficiency, maximizing the individual utility of agents, and building stable networks, especially in comparison to the result of an anarchic network formation. The mechanism also aims to produce Nash equilibria and encourages agents to honestly declare their preferences instead of playing strategically. The mechanism approach is a true superset of both centralized network design and uncoordinated network creation games. To the best of our knowledge this is the first attempt to explore the realm inbetween those extremes.

Fischbeck, Philipp On the Effectiveness of Data Reduction for Covering Problems in RealWorld Transit Networks. Master Thesis, Hasso Plattner Institute 2018
Given a graph and a set of paths, we want to find the minimal set of vertices such that each path is covered by at least one chosen vertex. Although this problem is NPhard, realworld instances can be solved almost completely by a set of simple reduction rules. We examine this behavior from a theoretical and empirical perspective. First, we show that the problem is easy to solve for forests and cycle graphs. However, the problem is NPhard for a feedback vertex number of 2 and a treewidth of 3. This indicates that the explanation for the effectiveness does not lie in the graph representation of problem instances. Thus, we examine the Hitting Set problem that arises when ignoring the graph representation and interpreting a path as a mere set of vertices. Through this relation, we show that the problem remains NPhard even for very strong restrictions. Hitting Set instances that have a representation as a path graph can be recognized as such in polynomial time. However, finding the graph representation with the fewest edges is NPhard. Based on the analysis of publicly available transit datasets, we show that the realworld instances are clustered and have heterogeneous stations, with the number of lines per station distributed according to a power law. We describe a model to generate random problem instances with adjustable clustering and heterogeneity. We use this model to show that while the heterogeneity does positively influence the effectiveness of the reduction rules, the largest effect comes from the clustering. Lastly, we show a strong relation between the reduction rules for the Hitting Set problem and reduction rules for the Independent Set problem on the intersection graph of the family of sets. We prove that the size of any independent set is a lower bound on the size of the maximum hitting set and show that the two bounds are very close for realworld instances. We show that the reduction rules need to be effective for Independent Set in order for them to be effective for Hitting Set.

Rizzo, Manuel Contributions on Evolutionary Computation for Statistical Inference. Doctoral Dissertation, Sapienza University of Rome 2018

Anand, S.; Bringmann, Karl; Friedrich, Tobias; Garg, Naveen; Kumar, Amit Minimizing Maximum (Weighted) FlowTime on Related and Unrelated Machines. Algorithmica 2017: 515536
In this paper we initiate the study of job scheduling on related and unrelated machines so as to minimize the maximum flow time or the maximum weighted flow time (when each job has an associated weight). Previous work for these metrics considered only the setting of parallel machines, while previous work for scheduling on unrelated machines only considered \(L_p, p < \infty\) norms. Our main results are: (1) we give an \(O(\epsilon^{3})\)competitive algorithm to minimize maximum weighted flow time on related machines where we assume that the machines of the online algorithm can process \(1+\epsilon\) units of a job in 1 timeunit (\(\epsilon\) speed augmentation). (2) For the objective of minimizing maximum flow time on unrelated machines we give a simple \(2/\epsilon\)competitive algorithm when we augment the speed by \(\epsilon\). For \(m\) machines we show a lower bound of \(\Omega(m)\) on the competitive ratio if speed augmentation is not permitted. Our algorithm does not assign jobs to machines as soon as they arrive. To justify this "drawback" we show a lower bound of \(\Omega(\log m)\) on the competitive ratio of immediate dispatch algorithms. In both these lower bound constructions we use jobs whose processing times are in \(\{1,\infty\}\), and hence they apply to the more restrictive subset parallel setting. (3) For the objective of minimizing maximum weighted flow time on unrelated machines we establish a lower bound of \(\Omega(\log m)\)on the competitive ratio of any online algorithm which is permitted to use \(s = O(1)\) speed machines. In our lower bound construction, job \(j\) has a processing time of \(p_j\) on a subset of machines and infinity on others and has a weight \(1/p_j\). Hence this lower bound applies to the subset parallel setting for the special case of minimizing maximum stretch.

Doerr, Benjamin; Neumann, Frank; Sutton, Andrew M. Time Complexity Analysis of Evolutionary Algorithms on Random Satisfiable kCNF Formulas. Algorithmica 2017: 561586
We contribute to the theoretical understanding of randomized search heuristics by investigating their optimization behavior on satisfiable random ksatisfiability instances both in the planted solution model and the uniform model conditional on satisfiability. Denoting the number of variables by n, our main technical result is that the simple (1+1) evolutionary algorithm with high probability finds a satisfying assignment in time \(O(n \log n)\) when the clausevariable density is at least logarithmic. For low density instances, evolutionary algorithms seem to be less effective, and all we can show is a subexponential upper bound on the runtime for densities below \(1/(k(k1))\). We complement these mathematical results with numerical experiments on a broader density spectrum. They indicate that, indeed, the (1+1) EA is less efficient on lower densities. Our experiments also suggest that the implicit constants hidden in our main runtime guarantee are low. Our main result extends and considerably improves the result obtained by Sutton and Neumann (Lect Notes Comput Sci 8672:942951, 2014) in terms of runtime, minimum density, and clause length. These improvements are made possible by establishing a close fitnessdistance correlation in certain parts of the search space. This approach might be of independent interest and could be useful for other averagecase analyses of randomized search heuristics. While the notion of a fitnessdistance correlation has been around for a long time, to the best of our knowledge, this is the first time that fitnessdistance correlation is explicitly used to rigorously prove a performance statement for an evolutionary algorithm.

Bampas, Evangelos; Göbel, AndreasNikolas; Pagourtzis, Aris; Tentes, Aris On the connection between interval size functions and path counting. Computational Complexity 2017: 421467
We investigate the complexity of hard (\#Pcomplete) counting problems that have easy decision version. By ‘easy decision,’ we mean that deciding whether the result of counting is nonzero is in P. This property is shared by several wellknown problems, such as counting the number of perfect matchings in a given graph or counting the number of satisfying assignments of a given DNF formula. We focus on classes of such hardtocount easytodecide problems which emerged through two seemingly disparate approaches: one taken by Hemaspaandra et al. (SIAM J Comput 36(5):1264–1300, 2007), who defined classes of functions that count the size of intervals of ordered strings, and one followed by Kiayias et al. (Lect Notes Comput Sci 2563:453–463, 2001), who defined the classTotP, consisting of functions that count the total number of paths of NP computations. We provide inclusion and separation relations between TotP and interval size counting classes, by means of new classes that we define in this work. Our results imply that many known #Pcomplete problems with easy decision are contained in the classes defined by Hemaspaandra et al., but are unlikely to be complete for these classes under reductions under which these classes are downward closed, e.g., parsimonious reductions. This, applied to the #MONSAT problem, partially answers an open question of Hemaspaandra et al. We also define a new class of interval size functions which strictly contains FP and is strictly contained in TotP under reasonable complexitytheoretic assumptions. We show that this new class contains hard counting problems.

Rizzo, Manuel On Variability Analysis of Evolutionary AlgorithmBased Estimation. Conference of the Classification and Data Analysis Group (CLADAG) 2017: 237242
A theoretical framework to analyze variability of parametric estimates obtained via Evolutionary Algorithms (EAs) is proposed. The nature of EAs, in fact, introduces a further source of variability, due to stochastic elements of the procedure. A simulation study employing Genetic Algorithms and Differential Evolution is also conducted in oder to make comments on the effect of these stochastic elements on variability.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. The Compact Genetic Algorithm is Efficient under Extreme Gaussian Noise. IEEE Transactions on Evolutionary Computation 2017: 477490
Practical optimization problems frequently include uncertainty about the quality measure, for example due to noisy evaluations. Thus, they do not allow for a straightforward application of traditional optimization techniques. In these settings, randomized search heuristics such as evolutionary algorithms are a popular choice because they are often assumed to exhibit some kind of resistance to noise. Empirical evidence suggests that some algorithms, such as estimation of distribution algorithms (EDAs) are robust against a scaling of the noise intensity, even without resorting to explicit noisehandling techniques such as resampling. In this paper, we want to support such claims with mathematical rigor. We introduce the concept of graceful scaling in which the run time of an algorithm scales polynomially with noise intensity. We study a monotone fitness function over binary strings with additive noise taken from a Gaussian distribution. We show that myopic heuristics cannot efficiently optimize the function under arbitrarily intense noise without any explicit noisehandling. Furthermore, we prove that using a population does not help. Finally we show that a simple EDA called the compact Genetic Algorithm can overcome the shortsightedness of mutationonly heuristics to scale gracefully with noise. We conjecture that recombinative genetic algorithms also have this property.

Galanis, Andreas; Göbel, Andreas; Goldberg, Leslie Ann; Lapinskas, John; Richerby, David Amplifiers for the Moran Process. Journal of the ACM 2017: 5:15:90
The Moran process, as studied by Lieberman, Hauert, and Nowak, is a randomised algorithm modelling the spread of genetic mutations in populations. The algorithm runs on an underlying graph where individuals correspond to vertices. Initially, one vertex (chosen uniformly at random) possesses a mutation, with fitness \(r > 1\). All other individuals have fitness 1. During each step of the algorithm, an individual is chosen with probability proportional to its fitness, and its state (mutant or nonmutant) is passed on to an outneighbour which is chosen uniformly at random. If the underlying graph is strongly connected, then the algorithm will eventually reach fixation, in which all individuals are mutants, or extinction, in which no individuals are mutants. An infinite family of directed graphs is said to be strongly amplifying if, for every \(r > 1\), the extinction probability tends to 0 as the number of vertices increases. A formal definition is provided in the article. Strong amplification is a rather surprising propertyit means that in such graphs, the fixation probability of a uniformly placed initial mutant tends to 1 even though the initial mutant only has a fixed selective advantage of \(r > 1\) (independently of \(n\)). The name "strongly amplifying" comes from the fact that this selective advantage is "amplified." Strong amplifiers have received quite a bit of attention, and Lieberman et al. proposed two potentially strongly amplifying familiessuperstars and metafunnels. Heuristic arguments have been published, arguing that there are infinite families of superstars that are strongly amplifying. The same has been claimed for metafunnels. In this article, we give the first rigorous proof that there is an infinite family of directed graphs that is strongly amplifying. We call the graphs in the family "megastars." When the algorithm is run on an \(n\)vertex graph in this family, starting with a uniformly chosen mutant, the extinction probability is roughly \(n  1/2\) (up to logarithmic factors). We prove that all infinite families of superstars and metafunnels have larger extinction probabilities (as a function of \(n\)). Finally, we prove that our analysis of megastars is fairly tight  there is no infinite family of megastars such that the Moran algorithm gives a smaller extinction probability (up to logarithmic factors). Also, we provide a counterexample which clarifies the literature concerning the isothermal theorem of Lieberman et al.

Boockmeyer, Arne; Fischbeck, Philipp; Neubert, Stefan Fit fürs Studium  Informatik. 2017 Rheinwerk Computing.

Seidel, Karen Zu mathematischen Argumentationen eines Experten aus einer semiotischen Perspektive. Beiträge zum Mathematikunterricht 2017 2017: 897900
In Argumentationen unter in der Forschung tätigen Mathematikern kombiniert die erklärende Person Gesten, Sprache und Zeichen und verleiht damit unter anderem den concept images der verwendeten mathematischen Begriffe Ausdruck. Im Folgenden wird der Frage nachgegangen, inwiefern eine multimodale Analyse von erklärenden Phasen in Argumentationen von Experten mit dem Fokus auf Begriffsbildungsprozessen Aufschluss geben kann über operationale und strukturelle Vorstellungen von Begriffen (Sfard, 1991).

Friedrich, Tobias; Kötzing, Timo; Wagner, Markus A Generic BetandRun Strategy for Speeding Up Stochastic Local Search. Conference on Artificial Intelligence (AAAI) 2017: 801807
A common strategy for improving optimization algorithms is to restart the algorithm when it is believed to be trapped in an inferior part of the search space. However, while specific restart strategies have been developed for specific problems (and specific algorithms), restarts are typically not regarded as a general tool to speed up an optimization algorithm. In fact, many optimization algorithms do not employ restarts at all. Recently, "betandrun" was introduced in the context of mixedinteger programming, where first a number of short runs with randomized initial conditions is made, and then the most promising run of these is continued. In this article, we consider two classical NPcomplete combinatorial optimization problems, traveling salesperson and minimum vertex cover, and study the effectiveness of different betandrun strategies. In particular, our restart strategies do not take any problem knowledge into account, nor are tailored to the optimization algorithm. Therefore, they can be used offtheshelf. We observe that stateoftheart solvers for these problems can benefit significantly from restarts on standard benchmark instances.

Katzmann, Maximilian; Komusiewicz, Christian Systematic Exploration of Larger Local Search Neighborhoods for the Minimum Vertex Cover Problem. Conference on Artificial Intelligence (AAAI) 2017: 846852
We investigate the potential of exhaustively exploring larger neighborhoods in local search algorithms for Minimum Vertex Cover. More precisely, we study whether, for moderate values of \(k\), it is feasible and worthwhile to determine, given a graph \(G\) with vertex cover \(C\), if there is a \(k\)swap \(S\) such that \((C \setminus S) cup (S \setminus C)\) is a smaller vertex cover of \(G\). First, we describe an algorithm running in \(\Delta^{O(k) \cdot n\) time for searching the \(k\)swap neighborhood on \(n\)vertex graphs with maximum degree \(\Delta\). Then, we demonstrate that, by devising additional pruning rules that decrease the size of the search space, this algorithm can be implemented so that it solves the problem quickly for \(k \approx 20\). Finally, we show that it is worthwhile to consider moderatelysized \(k\)swap neighborhoods. For our benchmark data set, we show that when combining our algorithm with a hillclimbing approach, the solution quality improves quickly with the radius \(k\) of the local search neighborhood and that in most cases optimal solutions can be found by setting \(k = 21\).

Friedrich, Tobias; Krohmer, Anton; Rothenberger, Ralf; Sutton, Andrew M. Phase Transitions for ScaleFree SAT Formulas. Conference on Artificial Intelligence (AAAI) 2017: 38933899
Recently, a number of nonuniform random satisfiability models have been proposed that are closer to practical satisfiability problems in some characteristics. In contrast to uniform random Boolean formulas, scalefree formulas have a variable occurrence distribution that follows a power law. It has been conjectured that such a distribution is a more accurate model for some industrial instances than the uniform random model. Though it seems that there is already an awareness of a threshold phenomenon in such models, there is still a complete picture lacking. In contrast to the uniform model, the critical density threshold does not lie at a single point, but instead exhibits a functional dependency on the powerlaw exponent. For scalefree formulas with clauses of length \(k = 2\), we give a lower bound on the phase transition threshold as a function of the scaling parameter. We also perform computational studies that suggest our bound is tight and investigate the critical density for formulas with higher clause lengths. Similar to the uniform model, on formulas with \(k \geq 3\), we find that the phase transition regime corresponds to a set of formulas that are difficult to solve by backtracking search.

Friedrich, Tobias; Neumann, Frank What's Hot in Evolutionary Computation. Conference on Artificial Intelligence (AAAI) 2017: 50645066
We provide a brief overview on some hot topics in the area of evolutionary computation. Our main focus is on recent developments in the areas of combinatorial optimization and realworld applications. Furthermore, we highlight recent progress on the theoretical understanding of evolutionary computing methods.

Hölzl, Rupert; Jain, Sanjay; Schlicht, Philipp; Seidel, Karen; Stephan, Frank Automatic Learning from Repetitive Texts. Algorithmic Learning Theory (ALT) 2017: 129150
We study the connections between the learnability of automatic families of languages and the types of text used to present them to a learner. More precisely, we study how restrictions on the number of times that a correct datum appears in a text influence what classes of languages are automatically learnable. We show that an automatic family of languages is automatically learnable from fat text iff it is automatically learnable from thick text iff it is verifiable from balanced text iff it satisfies Angluin's telltale condition. Furthermore, many automatic families are automatically learnable from exponential text. We also study the relationship between automatic learnability and verifiability and show that all automatic families are automatically partially verifiable from exponential text and automatically learnable from thick text.

Kötzing, Timo; Schirneck, Martin; Seidel, Karen Normal Forms in Semantic Language Identification. Algorithmic Learning Theory (ALT) 2017: 493516
We consider language learning in the limit from text where all learning restrictions are semantic, that is, where any conjecture may be replaced by a semantically equivalent conjecture. For different such learning criteria, starting with the wellknown TxtGBclearning, we consider three different normal forms: strongly locking learning, consistent learning and (partially) setdriven learning. These normal forms support and simplify proofs and give insight into what behaviors are necessary for successful learning (for example when consistency in conservative learning implies cautiousness and strong decisiveness). We show that strongly locking learning can be assumed for partially setdriven learners, even when learning restrictions apply. We give a very general proof relying only on a natural property of the learning restriction, namely, allowing for simulation on equivalent text. Furthermore, when no restrictions apply, also the converse is true: every strongly locking learner can be made partially setdriven. For several semantic learning criteria we show that learning can be done consistently. Finally, we deduce for which learning restrictions partial setdrivenness and setdrivenness coincide, including a general statement about classes of infinite languages. The latter again relies on a simulation argument.

Gao, Wanru; Friedrich, Tobias; Kötzing, Timo; Neumann, Frank Scaling up Local Search for Minimum Vertex Cover in Large Graphs by Parallel Kernelization. Australasian Conference on Artificial Intelligence (AUSAI) 2017: 131143
We investigate how wellperforming local search algorithms for small or medium size instances can be scaled up to perform well for large inputs. We introduce a parallel kernelization technique that is motivated by the assumption that graphs in medium to large scale are composed of components which are on their own easy for stateoftheart solvers but when hidden in large graphs are hard to solve. To show the effectiveness of our kernelization technique, we consider the wellknown minimum vertex cover problem and two stateoftheart solvers called NuMVC and FastVC. Our kernelization approach reduces an existing large problem instance significantly and produces better quality results on a wide range of benchmark instances and real world graphs.

Wagner, Markus; Friedrich, Tobias; Lindauer, Marius Improving local search in a minimum vertex cover solver for classes of networks. Congress on Evolutionary Computation (CEC) 2017: 17041711
For the minimum vertex cover problem, a wide range of solvers has been proposed over the years. Most classical exact approaches are encountering run time issues on massive graphs that are considered nowadays. A straightforward alternative approach is then to use heuristics, which make assumptions about the structure of the studied graphs. These assumptions are typically hardcoded and are hoped to work well for a wide range of networks  which is in conflict with the nature of broad benchmark sets. With this article, we contribute in two ways. First, we identify a component in an existing solver that influences its performance depending on the class of graphs, and we then customize instances of this solver for different classes of graphs. Second, we create the first algorithm portfolio for the minimum vertex cover to further improve the performance of a single integrated approach to the minimum vertex cover problem.

Kovacs, Robert; Seufert, Anna; Wall, Ludwig; Chen, HsiangTing; Meinel, Florian; Müller, Willi; You, Sijing; Brehm, Maximilian; Striebel, Jonathan; Kommana, Yannis; Popiak, Alexander; Bläsius, Thomas; Baudisch, Patrick TrussFab: Fabricating Sturdy LargeScale Structures on Desktop 3D Printers. Human Factors in Computing Systems (CHI) 2017: 26062616
We present TrussFab, an integrated endtoend system that allows users to fabricate large scale structures that are sturdy enough to carry human weight. TrussFab achieves the large scale by complementing 3D print with plastic bottles. It does not use these bottles as "bricks" though, but as beams that form structurally sound nodelink structures, also known as trusses, allowing it to handle the forces resulting from scale and load. TrussFab embodies the required engineering knowledge, allowing nonengineers to design such structures and to validate their design using integrated structural analysis. We have used TrussFab to design and fabricate tables and chairs, a 2.5 m long bridge strong enough to carry a human, a functional boat that seats two, and a 5 m diameter dome.

Friedrich, Tobias; Krohmer, Anton; Rothenberger, Ralf; Sauerwald, Thomas; Sutton, Andrew M. Bounds on the Satisfiability Threshold for Power Law Distributed Random SAT. European Symposium on Algorithms (ESA) 2017: 37:137:15
Propositional satisfiability (SAT) is one of the most fundamental problems in computer science. The worstcase hardness of SAT lies at the core of computational complexity theory. The averagecase analysis of SAT has triggered the development of sophisticated rigorous and nonrigorous techniques for analyzing random structures. Despite a long line of research and substantial progress, nearly all theoretical work on random SAT assumes a uniform distribution on the variables. In contrast, realworld instances often exhibit large fluctuations in variable occurrence. This can be modeled by a scalefree distribution of the variables, which results in distributions closer to industrial SAT instances. We study random \(k\)SAT on \(n\) variables, \(m=\Theta(n)\) clauses, and a power law distribution on the variable occurrences with exponent \(\beta\). We observe a satisfiability threshold at \(\beta=(2k1)/(k1)\). This threshold is tight in the sense that instances with \(beta < (2k1)/(k1)\varepsilon\) for any constant \(\varepsilon>0\) are unsatisfiable with high probability (w.h.p.). For \(\beta\ge(2k1)/(k1)+\varepsilon\), the picture is reminiscent of the uniform case: instances are satisfiable w.h.p. for sufficiently small constant clausevariable ratios \(m/n\); they are unsatisfiable above a ratio \(m/n\) that depends on \(\beta\).

Friedrich, Tobias; Kötzing, Timo; Quinzan, Francesco; Sutton, Andrew Michael Resampling vs Recombination: a Statistical Run Time Estimation. Foundations of Genetic Algorithms (FOGA) 2017: 2535
Noise is pervasive in realworld optimization, but there is still little understanding of the interplay between the operators of randomized search heuristics and explicit noisehandling techniques, such as statistical resampling. In this paper, we report on several statistical models and theoretical results that help to clarify this reciprocal relationship for a collection of randomized search heuristics on noisy functions. We consider the optimization of pseudoBoolean functions under additive posterior Gaussian noise and explore the tradeo between noise reduction and the computational cost of resampling. We first perform experiments to find the optimal parameters at a given noise intensity for a mutationonly evolutionary algorithm, a genetic algorithm employing recombination, an estimation of distribution algorithm (EDA), and an ant colony optimization algorithm. We then observe how the optimal parameter depends on the noise intensity for the different algorithms. Finally, we locate the point where statistical resampling costs more than it is worth in terms of run time. We find that the EA requires the highest number of resamples to obtain the best speedup, whereas crossover reduces both the run time and the number of resamples required. Most surprisingly, we find that EDAlike algorithms require no resampling, and can handle noise implicitly.

Pourhassan, Mojgan; Friedrich, Tobias; Neumann, Frank On the Use of the Dual Formulation for Minimum Weighted Vertex Cover in Evolutionary Algorithms. Foundations of Genetic Algorithms (FOGA) 2017: 3744
We consider the weighted minimum vertex cover problem and investigate how its dual formulation can be exploited to design evolutionary algorithms that provably obtain a 2approximation. Investigating multivalued representations, we show that variants of randomized local search and the (1+1) EA achieve this goal in expected pseudopolynomial time. In order to speed up the process, we consider the use of step size adaptation in both algorithms and show that RLS obtains a 2approximation in expected polynomial time while the (1+1) EA still encounters a pseudopolynomial lower bound.

Friedrich, Tobias; Kötzing, Timo; Lagodzinski, J. A. Gregor; Neumann, Frank; Schirneck, Martin Analysis of the (1+1) EA on Subclasses of Linear Functions under Uniform and Linear Constraints. Foundations of Genetic Algorithms (FOGA) 2017: 4554
Linear functions have gained a lot of attention in the area of run time analysis of evolutionary computation methods and the corresponding analyses have provided many effective tools for analyzing more complex problems. In this paper, we consider the behavior of the classical (1+1) Evolutionary Algorithm for linear functions under linear constraint. We show tight bounds in the case where both the objective function and the constraint is given by the OneMax function and present upper bounds as well as lower bounds for the general case. Furthermore, we also consider the LeadingOnes fitness function.

Krejca, Martin S.; Witt, Carsten Lower Bounds on the Run Time of the Univariate Marginal Distribution Algorithm on OneMax. Foundations of Genetic Algorithms (FOGA) 2017: 6579
The Univariate Marginal Distribution Algorithm (UMDA), a popular estimation of distribution algorithm, is studied from a run time perspective. On the classical OneMax benchmark function, a lower bound of \(Omega (\mu \sqrt n + n \log n)\), where \(\mu\) is the population size, on its expected run time is proved. This is the first direct lower bound on the run time of the UMDA. It is stronger than the bounds that follow from general blackbox complexity theory and is matched by the run time of many evolutionary algorithms. The results are obtained through advanced analyses of the stochastic change of the frequencies of bit values maintained by the algorithm, including carefully designed potential functions. These techniques may prove useful in advancing the field of run time analysis for estimation of distribution algorithms in general.

Chauhan, Ankit; Friedrich, Tobias; Quinzan, Francesco Approximating Optimization Problems using EAs on ScaleFree Networks. Genetic and Evolutionary Computation Conference (GECCO) 2017: 235242
It has been experimentally observed that realworld networks follow certain topologicalproperties, such as smallworld, powerlaw etc. To study these networks, many random graph models, such as Preferential Attachment, have been proposed. In this paper, we consider the deterministic properties which capture powerlaw degree distribution and degeneracy. Networks with these properties are known as scalefree networks in the literature. Many interesting problems remain NPhard on scalefree networks. We study the relationship between scalefree properties and the approximationratio of some commonly used evolutionary algorithms. For the Vertex Cover, we observe experimentally that the \((1+1)\) EA always gives the better result than a greedy local search, even when it runs for only \(O(n, \log(n))\) steps. We give the construction of a scalefree network in which a multiobjective algorithm and a greedy algorithm obtain optimal solutions, while the \((1+1)\) EA obtains the worst possible solution with constant probability. We prove that for the Dominating Set, Vertex Cover, Connected Dominating Set and Independent Set, the \((1+1)\) EA obtains constantfactor approximation in expected run time \(O(n, \log(n))\) and \(O(n^4)\) respectively. Whereas, GSEMO gives even better approximation than \((1+1)\) EA in expected run time \(O(n^3)\) for Dominating Set, Vertex Cover and Connected Dominating Set on such networks.

Friedrich, Tobias; Kötzing, Timo; Melnichenko, Anna Analyzing Search Heuristics with Differential Equations. Genetic and Evolutionary Computation Conference (GECCO) 2017: 313314
Drift Theory is currently the most common technique for the analysis of randomized search heuristics because of its broad applicability and the resulting tight first hitting time bounds. The biggest problem when applying a drift theorem is to find a suitable potential function which maps a complex space into a single number, capturing the essence of the state of the search in just one value. We discuss another method for the analysis of randomized search heuristics based on the Theory of Differential Equations. This method considers the deterministic counterpart of the randomized process by replacing probabilistic outcomes by their expectation, and then bounding the error with good probability. We illustrate this by analyzing an Ant Colony Optimization algorithm (ACO) for the Minimum Spanning Tree problem (MST).

Doerr, Benjamin; Kötzing, Timo; Lagodzinski, J. A. Gregor; Lengler, Johannes Bounding Bloat in Genetic Programming. Genetic and Evolutionary Computation Conference (GECCO) 2017: 921928
While many optimization problems work with a fixed number of decision variables and thus a fixedlength representation of possible solutions, genetic programming (GP) works on variablelength representations. A naturally occurring problem is that of bloat (unnecessary growth of solutions) slowing down optimization. Theoretical analyses could so far not bound bloat and required explicit assumptions on the magnitude of bloat. In this paper we analyze bloat in mutationbased genetic programming for the two test functions ORDER and MAJORITY. We overcome previous assumptions on the magnitude of bloat and give matching or closetomatching upper and lower bounds for the expected optimization time. In particular, we show that the \((1+1)\) GP takes (i) \(\Theta(T_\text{init + n\, \log n)\) iterations with bloat control on ORDER as well as MAJORITY; and (ii) \(O(T_{\text{init ,log \,T_{\text{init + n(\log n)^3)\) and \(\Omega(T_\text{init + n \,\log n)\) (and \(\Omega(T_\text{init \,\log \,T_{\text{init}})\) for \(n = 1\)) iterations without bloat control on MAJORITY.
