
Doerr, Benjamin; Neumann, Frank; Sutton, Andrew M. Time Complexity Analysis of Evolutionary Algorithms on Random Satisfiable kCNF Formulas. Algorithmica 2017: 561586
We contribute to the theoretical understanding of randomized search heuristics by investigating their optimization behavior on satisfiable random ksatisfiability instances both in the planted solution model and the uniform model conditional on satisfiability. Denoting the number of variables by n, our main technical result is that the simple (1+1) evolutionary algorithm with high probability finds a satisfying assignment in time \(O(n \log n)\) when the clausevariable density is at least logarithmic. For low density instances, evolutionary algorithms seem to be less effective, and all we can show is a subexponential upper bound on the runtime for densities below \(1/(k(k1))\). We complement these mathematical results with numerical experiments on a broader density spectrum. They indicate that, indeed, the (1+1) EA is less efficient on lower densities. Our experiments also suggest that the implicit constants hidden in our main runtime guarantee are low. Our main result extends and considerably improves the result obtained by Sutton and Neumann (Lect Notes Comput Sci 8672:942951, 2014) in terms of runtime, minimum density, and clause length. These improvements are made possible by establishing a close fitnessdistance correlation in certain parts of the search space. This approach might be of independent interest and could be useful for other averagecase analyses of randomized search heuristics. While the notion of a fitnessdistance correlation has been around for a long time, to the best of our knowledge, this is the first time that fitnessdistance correlation is explicitly used to rigorously prove a performance statement for an evolutionary algorithm.

Dang, DucCuong; Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Lehre, Per Kristian; Oliveto, Pietro S.; Sudholt, Dirk; Sutton, Andrew M. Escaping Local Optima Using Crossover with Emergent Diversity. IEEE Transactions on Evolutionary Computation 2017
Population diversity is essential for the effective use of any crossover operator. We compare seven commonly used diversity mechanisms and prove rigorous run time bounds for the \((\mu+1)\) GA using uniform crossover on the fitness function \(Jump_k\). All previous results in this context only hold for unrealistically low crossover probability \(p_c=O(k/n)\), while we give analyses for the setting of constant \(p_c < 1\) in all but one case. Our bounds show a dependence on the problem size \(n\), the jump length \(k\), the population size \(\mu\), and the crossover probability \(p_c\). For the typical case of constant \(k > 2\) and constant \(p_c\), we can compare the resulting expected optimisation times for different diversity mechanisms assuming an optimal choice of \(\mu\): \(O(n^{k1})\) for duplicate elimination/minimisation, \(O(n^2 \log n)\) for maximising the convex hull, \(O(n \log n)\) for det. crowding (assuming \(p_c = k/n\)), \(O(n \log n)\) for maximising the Hamming distance, \(O(n \log n)\) for fitness sharing, \(O(n \log n)\) for the singlereceiver island model. This proves a sizeable advantage of all variants of the \((\mu+1)\) GA compared to the (1+1) EA, which requires \(\Theta(n^k)\). In a short empirical study we confirm that the asymptotic differences can also be observed experimentally.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. The Compact Genetic Algorithm is Efficient under Extreme Gaussian Noise. IEEE Transactions on Evolutionary Computation 2017: 477490
Practical optimization problems frequently include uncertainty about the quality measure, for example due to noisy evaluations. Thus, they do not allow for a straightforward application of traditional optimization techniques. In these settings, randomized search heuristics such as evolutionary algorithms are a popular choice because they are often assumed to exhibit some kind of resistance to noise. Empirical evidence suggests that some algorithms, such as estimation of distribution algorithms (EDAs) are robust against a scaling of the noise intensity, even without resorting to explicit noisehandling techniques such as resampling. In this paper, we want to support such claims with mathematical rigor. We introduce the concept of graceful scaling in which the run time of an algorithm scales polynomially with noise intensity. We study a monotone fitness function over binary strings with additive noise taken from a Gaussian distribution. We show that myopic heuristics cannot efficiently optimize the function under arbitrarily intense noise without any explicit noisehandling. Furthermore, we prove that using a population does not help. Finally we show that a simple EDA called the compact Genetic Algorithm can overcome the shortsightedness of mutationonly heuristics to scale gracefully with noise. We conjecture that recombinative genetic algorithms also have this property.

Friedrich, Tobias; Krohmer, Anton; Rothenberger, Ralf; Sutton, Andrew M. Phase Transitions for ScaleFree SAT Formulas. Association for the Advancement of Artificial Intelligence (AAAI) 2017: 38933899
Recently, a number of nonuniform random satisfiability models have been proposed that are closer to practical satisfiability problems in some characteristics. In contrast to uniform random Boolean formulas, scalefree formulas have a variable occurrence distribution that follows a power law. It has been conjectured that such a distribution is a more accurate model for some industrial instances than the uniform random model. Though it seems that there is already an awareness of a threshold phenomenon in such models, there is still a complete picture lacking. In contrast to the uniform model, the critical density threshold does not lie at a single point, but instead exhibits a functional dependency on the powerlaw exponent. For scalefree formulas with clauses of length \(k = 2\), we give a lower bound on the phase transition threshold as a function of the scaling parameter. We also perform computational studies that suggest our bound is tight and investigate the critical density for formulas with higher clause lengths. Similar to the uniform model, on formulas with \(k \geq 3\), we find that the phase transition regime corresponds to a set of formulas that are difficult to solve by backtracking search.

Friedrich, Tobias; Krohmer, Anton; Rothenberger, Ralf; Sauerwald, Thomas; Sutton, Andrew M. Bounds on the Satisfiability Threshold for Power Law Distributed Random SAT. European Symposium on Algorithms (ESA) 2017: 37:137:15
Propositional satisfiability (SAT) is one of the most fundamental problems in computer science. The worstcase hardness of SAT lies at the core of computational complexity theory. The averagecase analysis of SAT has triggered the development of sophisticated rigorous and nonrigorous techniques for analyzing random structures. Despite a long line of research and substantial progress, nearly all theoretical work on random SAT assumes a uniform distribution on the variables. In contrast, realworld instances often exhibit large fluctuations in variable occurrence. This can be modeled by a scalefree distribution of the variables, which results in distributions closer to industrial SAT instances. We study random \(k\)SAT on \(n\) variables, \(m=\Theta(n)\) clauses, and a power law distribution on the variable occurrences with exponent \(\beta\). We observe a satisfiability threshold at \(\beta=(2k1)/(k1)\). This threshold is tight in the sense that instances with \(beta < (2k1)/(k1)\varepsilon\) for any constant \(\varepsilon>0\) are unsatisfiable with high probability (w.h.p.). For \(\beta\ge(2k1)/(k1)+\varepsilon\), the picture is reminiscent of the uniform case: instances are satisfiable w.h.p. for sufficiently small constant clausevariable ratios \(m/n\); they are unsatisfiable above a ratio \(m/n\) that depends on \(\beta\).

Friedrich, Tobias; Kötzing, Timo; Quinzan, Francesco; Sutton, Andrew Michael Resampling vs Recombination: a Statistical Run Time Estimation. Foundations of Genetic Algorithms (FOGA) 2017: 2535
Noise is pervasive in realworld optimization, but there is still little understanding of the interplay between the operators of randomized search heuristics and explicit noisehandling techniques, such as statistical resampling. In this paper, we report on several statistical models and theoretical results that help to clarify this reciprocal relationship for a collection of randomized search heuristics on noisy functions. We consider the optimization of pseudoBoolean functions under additive posterior Gaussian noise and explore the tradeo between noise reduction and the computational cost of resampling. We first perform experiments to find the optimal parameters at a given noise intensity for a mutationonly evolutionary algorithm, a genetic algorithm employing recombination, an estimation of distribution algorithm (EDA), and an ant colony optimization algorithm. We then observe how the optimal parameter depends on the noise intensity for the different algorithms. Finally, we locate the point where statistical resampling costs more than it is worth in terms of run time. We find that the EA requires the highest number of resamples to obtain the best speedup, whereas crossover reduces both the run time and the number of resamples required. Most surprisingly, we find that EDAlike algorithms require no resampling, and can handle noise implicitly.

Sutton, Andrew M. Superpolynomial Lower Bounds for the (1+1) EA on Some Easy Combinatorial Problems. Algorithmica 2016: 507528
The (1+1) EA is a simple evolutionary algorithm that is known to be efficient on linear functions and on some combinatorial optimization problems. In this paper, we rigorously study its behavior on two easy combinatorial problems: finding the 2coloring of a class of bipartite graphs, and constructing satisfying assignments for a class of satisfiable 2CNF Boolean formulas. We prove that it is inefficient on both problems in the sense that the number of iterations the algorithm needs to minimize the cost functions is superpolynomial with high probability. Our motivation is to better understand the influence of problem instance structure on the runtime character of a simple evolutionary algorithm. We are interested in what kind of structural features give rise to socalled metastable states at which, with probability \(1  o(1)\), the (1+1) EA becomes trapped and subsequently has difficulty leaving. Finally, we show how to modify the (1+1) EA slightly in order to obtain a polynomialtime performance guarantee on both problems.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. Robustness of Ant Colony Optimization to Noise. Evolutionary Computation 2016: 237254
Recently Ant Colony Optimization (ACO) algorithms have been proven to be efficient in uncertain environments, such as noisy or dynamically changing fitness functions. Most of these analyses focus on combinatorial problems, such as path finding. We analyze an ACO algorithm in a setting where we try to optimize the simple OneMax test function, but with additive posterior noise sampled from a Gaussian distribution. Without noise the classical \((\mu+1)\)EA outperforms any ACO algorithm, with smaller \(\mu\) being better; however, with large noise, the \((\mu+1)\)EA fails, even for high values of \(\mu\) (which are known to help against small noise). In this paper we show that ACO is able to deal with arbitrarily large noise in a graceful manner, that is, as long as the evaporation factor \(p\) is small enough dependent on the parameter \(\delta^2\) of the noise and the dimension \(n\) of the search space \((p = o(1/(n(n + \delta \log n)^2 \log n)))\), optimization will be successful.

Friedrich, Tobias; Kötzing, Timo; Quinzan, Francesco; Sutton, Andrew M. Ant Colony Optimization Beats Resampling on Noisy Functions. Genetic and Evolutionary Computation Conference (GECCO) 2016: 34
Despite the pervasiveness of noise in realworld optimization, there is little understanding of the interplay between the operators of randomized search heuristics and explicit noisehandling techniques such as statistical resampling. Ant Colony Optimization (ACO) algorithms are claimed to be particularly wellsuited to dynamic and noisy problems, even without explicit noisehandling techniques. In this work, we empirically investigate the tradeoffs between resampling an the noisehandling abilities of ACO algorithms. Our main focus is to locate the point where resampling costs more than it is worth.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. The Benefit of Recombination in Noisy Evolutionary Search. Genetic and Evolutionary Computation Conference (GECCO) 2016: 161162
Practical optimization problems frequently include uncertainty about the quality measure, for example due to noisy evaluations. Thus, they do not allow for a straightforward application of traditional optimization techniques. In these settings, randomized search heuristics such as evolutionary algorithms are a popular choice because they are often assumed to exhibit some kind of resistance to noise. Empirical evidence suggests that some algorithms, such as estimation of distribution algorithms (EDAs) are robust against a scaling of the noise intensity, even without resorting to explicit noisehandling techniques such as resampling. In this paper, we want to support such claims with mathematical rigor. We introduce the concept of graceful scaling in which the run time of an algorithm scales polynomially with noise intensity. We study a monotone fitness function over binary strings with additive noise taken from a Gaussian distribution. We show that myopic heuristics cannot efficiently optimize the function under arbitrarily intense noise without any explicit noisehandling. Furthermore, we prove that using a population does not help. Finally we show that a simple EDA called the Compact Genetic Algorithm can overcome the shortsightedness of mutationonly heuristics to scale gracefully with noise. We conjecture that recombinative genetic algorithms also have this property.

Dang, DucCuong; Friedrich, Tobias; Krejca, Martin S.; Kötzing, Timo; Lehre, Per Kristian; Oliveto, Pietro S.; Sudholt, Dirk; Sutton, Andrew Michael Escaping Local Optima with Diversity Mechanisms and Crossover. Genetic and Evolutionary Computation Conference (GECCO) 2016: 645652
Population diversity is essential for the effective use of any crossover operator. We compare seven commonly used diversity mechanisms and prove rigorous run time bounds for the \((\mu+1)\) GA using uniform crossover on the fitness function \(Jump_k\). All previous results in this context only hold for unrealistically low crossover probability \(p_c=O(k/n)\), while we give analyses for the setting of constant \(p_c < 1\) in all but one case. Our bounds show a dependence on the problem size \(n\), the jump length \(k\), the population size \(\mu\), and the crossover probability \(p_c\). For the typical case of constant \(k > 2\) and constant \(p_c\), we can compare the resulting expected optimisation times for different diversity mechanisms assuming an optimal choice of \(\mu\): \(O(n^{k1})\) for duplicate elimination/minimisation, \(O(n^2 \log n)\) for maximising the convex hull, \(O(n \log n)\) for det. crowding (assuming \(p_c = k/n\)), \(O(n \log n)\) for maximising the Hamming distance, \(O(n \log n)\) for fitness sharing, \(O(n \log n)\) for the singlereceiver island model. This proves a sizeable advantage of all variants of the \((\mu+1)\) GA compared to the (1+1) EA, which requires \(\Theta(n^k)\). In a short empirical study we confirm that the asymptotic differences can also be observed experimentally.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. Graceful Scaling on Uniform versus SteepTailed Noise. Parallel Problem Solving From Nature (PPSN) 2016: 761770
Recently, different evolutionary algorithms (EAs) have been analyzed in noisy environments. The most frequently used noise model for this was additive posterior noise (noise added after the fitness evaluation) taken from a Gaussian distribution. In particular, for this setting it was shown that the \((\mu + 1)\)EA on OneMax does not scale gracefully (higher noise cannot efficiently be compensated by higher \(\mu\)). In this paper we want to understand whether there is anything special about the Gaussian distribution which makes the \((\mu + 1)\)EA not scale gracefully. We keep the setting of posterior noise, but we look at other distributions. We see that for exponential tails the \((\mu + 1)\)EA on OneMax does also not scale gracefully, for similar reasons as in the case of Gaussian noise. On the other hand, for uniform distributions (as well as other, similar distributions) we see that the \((\mu + 1)\)EA on OneMax does scale gracefully, indicating the importance of the noise model.

Friedrich, Tobias; Kötzing, Timo; Sutton, Andrew M. On the Robustness of Evolving Populations. Parallel Problem Solving From Nature (PPSN) 2016: 771781
Most theoretical work that studies the benefit of recombination focuses on the ability of crossover to speed up optimization time on specific search problems. In this paper, we take a slightly different perspective and investigate recombination in the context of evolving solutions that exhibit \(\emph{mutational}\) robustness, i.e., they display insensitivity to small perturbations. Various models in population genetics have demonstrated that increasing the effective recombination rate promotes the evolution of robustness. We show this result also holds in the context of evolutionary computation by proving crossover promotes the evolution of robust solutions in the standard \((\mu+1)\) GA. Surprisingly, our results show that the effect is present even when robust solutions are at a selective disadvantage due to lower fitness values.

Dang, DucCuong; Lehre, Per Kristian; Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Oliveto, Pietro S.; Sudholt, Dirk; Sutton, Andrew M. Emergence of Diversity and its Benefits for Crossover in Genetic Algorithms. Parallel Problem Solving From Nature (PPSN) 2016: 890900
Population diversity is essential for avoiding premature convergence in Genetic Algorithms (GAs) and for the effective use of crossover. Yet the dynamics of how diversity emerges in populations are not well understood. We use rigorous runtime analysis to gain insight into population dynamics and GA performance for a standard \((\mu+1)\) GA and the \(Jump_k\) test function. By studying the stochastic process underlying the size of the largest collection of identical genotypes we show that the interplay of crossover followed by mutation may serve as a catalyst leading to a sudden burst of diversity. This leads to improvements of the expected optimisation time of order \(\Omega(n/ \log n)\) compared to mutationonly algorithms like the \((1+1)\) EA.

Chicano, Francisco; Sutton, Andrew M.; Whitley, L. Darrell; Alba, Enrique Fitness Probability Distribution of BitFlip Mutation. Evolutionary Computation 2015: 217248
Bitflip mutation is a common mutation operator for evolutionary algorithms applied to optimize functions over binary strings. In this paper, we develop results from the theory of landscapes and Krawtchouk polynomials to exactly compute the probability distribution of fitness values of a binary string undergoing uniform bitflip mutation. We prove that this probability distribution can be expressed as a polynomial in \(p\), the probability of flipping each bit. We analyze these polynomials and provide closedform expressions for an easy linear problem Onemax, and an NPhard problem, MAXSAT. We also discuss a connection of the results with runtime analysis.

Paixão, Tiago; Badkobeh, Golnaz; Barton, Nick H.; Çörüş, Doğan; Dang, DucCuong; Friedrich, Tobias; Lehre, Per Kristian; Sudholt, Dirk; Sutton, Andrew; Trubenová, Barbora Toward a unifying framework for evolutionary processes. Journal of Theoretical Biology 2015: 2843
The theory of population genetics and evolutionary computation have been evolving separately for nearly 30 years. Many results have been independently obtained in both fields and many others are unique to its respective field. We aim to bridge this gap by developing a unifying framework for evolutionary processes that allows both evolutionary algorithms and population genetics models to be cast in the same formal framework. The framework we present here decomposes the evolutionary process into its several components in order to facilitate the identification of similarities between different models. In particular, we propose a classification of evolutionary operators based on the defining properties of the different components. We cast several commonly used operators from both fields into this common framework. Using this, we map different evolutionary and genetic algorithms to different evolutionary regimes and identify candidates with the most potential for the translation of results between the fields. This provides a unified description of evolutionary processes and represents a stepping stone towards new tools and results to both fields.

Nguyen, Anh Quang; Sutton, Andrew M.; Neumann, Frank Population size matters: Rigorous runtime results for maximizing the hypervolume indicator. Theoretical Computer Science 2015: 2436
Using the hypervolume indicator to guide the search of evolutionary multiobjective algorithms has become very popular in recent years. We contribute to the theoretical understanding of these algorithms by carrying out rigorous runtime analyses. We consider multiobjective variants of the problems OneMax and LeadingOnes called OMM and LOTZ, respectively, and investigate hypervolumebased algorithms with population sizes that do not allow coverage of the entire Pareto front. Our results show that LOTZ is easier to optimize than OMM for hypervolumebased evolutionary multiobjective algorithms which is contrary to the results on their singleobjective variants and the wellstudied (1+1)EA.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. Robustness of Ant Colony Optimization to Noise. Genetic and Evolutionary Computation Conference (GECCO) 2015: 1724
Best Paper Award (ACO/SI Track)
Recently Ant Colony Optimization (ACO) algorithms have been proven to be efficient in uncertain environments, such as noisy or dynamically changing fitness functions. Most of these analyses focus on combinatorial problems, such as path finding. We analyze an ACO algorithm in a setting where we try to optimize the simple OneMax test function, but with additive posterior noise sampled from a Gaussian distribution. Without noise the classical \((\mu+1)\)EA outperforms any ACO algorithm, with smaller \(\mu\) being better; however, with large noise, the \((\mu+1)\)EA fails, even for high values of \(\mu\) (which are known to help against small noise). In this paper we show that ACO is able to deal with arbitrarily large noise in a graceful manner, that is, as long as the evaporation factor \(\mu\) is small enough dependent on the parameter \(\delta^2\) of the noise and the dimension \(n\) of the search space \((p = o(1/(n(n + \delta \log n)^2 \log n)))\), optimization will be successful.

Doerr, Benjamin; Neumann, Frank; Sutton, Andrew M. Improved Runtime Bounds for the (1+1) EA on Random 3CNF Formulas Based on FitnessDistance Correlation. Genetic and Evolutionary Computation Conference (GECCO) 2015: 14151422
With this paper, we contribute to the theoretical understanding of randomized search heuristics by investigating their behavior on random 3SAT instances. We improve the results for the (1+1) EA obtained by Sutton and Neumann [PPSN 2014, 942951] in three ways. First, we reduce the upper bound by a linear factor and prove that the (1+1) EA obtains optimal solutions in time \(O(n \log n)\) with high probability on asymptotically almost all highdensity satisfiable 3CNF formulas. Second, we extend the range of densities for which this bound holds to satisfiable formulas of at least logarithmic density. Finally, we complement these mathematical results with numerical experiments that summarize the behavior of the (1+1) EA on formulas along the density spectrum, and suggest that the implicit constants hidden in our bounds are low. Our proofs are based on analyzing the run of the algorithm by establishing a fitnessdistance correlation. This approach might be of independent interest and we are optimistic that it is useful for the analysis of randomized search heuristics in various other settings. To our knowledge, this is the first time that fitnessdistance correlation is explicitly used to rigorously prove a performance statement for an evolutionary algorithm.

Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Sutton, Andrew M. The Benefit of Recombination in Noisy Evolutionary Search. International Symposium of Algorithms and Computation (ISAAC) 2015: 140150
Practical optimization problems frequently include uncertainty about the quality measure, for example due to noisy evaluations. Thus, they do not allow for a straightforward application of traditional optimization techniques. In these settings metaheuristics are a popular choice for deriving good optimization algorithms, most notably evolutionary algorithms which mimic evolution in nature. Empirical evidence suggests that genetic recombination is useful in uncertain environments because it can stabilize a noisy fitness signal. With this paper we want to support this claim with mathematical rigor. The setting we consider is that of noisy optimization. We study a simple noisy fitness function that is derived by adding Gaussian noise to a monotone function. First, we show that a classical evolutionary algorithm that does not employ sexual recombination (the \((\mu+1)\)EA) cannot handle the noise efficiently, regardless of the population size. Then we show that an evolutionary algorithm which does employ sexual recombination (the Compact Genetic Algorithm, short: cGA) can handle the noise using a graceful scaling of the population.

Sutton, Andrew M.; Neumann, Frank; Nallaperuma, Samadhi Parameterized Runtime Analyses of Evolutionary Algorithms for the Planar Euclidean Traveling Salesperson Problem. Evolutionary Computation 2014: 595628
Parameterized runtime analysis seeks to understand the influence of problem structure on algorithmic runtime. In this paper, we contribute to the theoretical understanding of evolutionary algorithms and carry out a parameterized analysis of evolutionary algorithms for the Euclidean traveling salesperson problem (Euclidean TSP). We investigate the structural properties in TSP instances that influence the optimization process of evolutionary algorithms and use this information to bound their runtime. We analyze the runtime in dependence of the number of inner points \(k\). In the first part of the paper, we study a \((\mu + \lambda)\) EA in a strictly black box setting and show that it can solve the Euclidean TSP in expected time \(O(n cdot A(\epsilon) cdot max\(\mu / \lambda) cdot n^2 , 1 + (\mu / \lambda) cdot n^4k(2k  1)!)\) where \(A\) is a function of the minimum angle \(\epsilon\) between any three points. Based on insights provided by the analysis, we improve this upper bound by introducing a mixed mutation strategy that incorporates both 2opt moves and permutation jumps. This strategy improves the upper bound to \(O(n cdot A(\epsilon) cdot max\(\mu / \lambda) cdot n^2 , 1 + (\mu / \lambda) cdot n^2k(k  1)!)\). In the second part of the paper, we use the information gained in the analysis to incorporate domain knowledge to design two fixedparameter tractable (FPT) evolutionary algorithms for the planar Euclidean TSP. We first develop a \((\mu + \lambda)\) EA based on an analysis by M. Theile, 2009, "Exact solutions to the traveling salesperson problem by a populationbased evolutionary algorithm," Lecture notes in computer science, Vol. 5482 (pp. 145155), that solves the TSP with \(k\) inner points in \(O(max\{2^kk^2n^2\lambda^{1},n\})\) generations with probability \(1  e^\Omega(n)}\). We then design a \((\mu + \lambda)\) EA that incorporates a dynamic programming step into the fitness evaluation. We prove that a variant of this evolutionary algorithm using 2opt mutation solves the problem after \(O(max\{(k  2)!k^2k2\lambda^{1, 1\})\) steps in expectation with a cost of \(O(nk)\) for each fitness evaluation.

Whitley, Darrell; Sutton, Andrew M.; Ochoa, Gabriela; Chicano, Francisco The component model for elementary landscapes and partial neighborhoods. Theoretical Computer Science 2014: 5975
Local search algorithms exploit moves on an adjacency graph of the search space. An "elementary landscape" exists if the objective function \(f\) is an eigenfunction of the Laplacian of the graph induced by the neighborhood operator; this allows various statistics about the neighborhood to be computed in closed form. A new component based model makes it relatively simple to prove that certain types of landscapes are elementary. The traveling salesperson problem, weighted graph (vertex) coloring and the minimum graph bisection problem yield elementary landscapes under commonly used local search operators. The component model is then used to efficiently compute the mean objective function value over partial neighborhoods for these same problems. For a traveling salesperson problem over \(n\) cities, the 2opt neighborhood can be decomposed into \(\lfloor n/2  1\rfloor\) partial neighborhoods. For graph coloring and the minimum graph bisection problem, partial neighborhoods can be used to focus search on those moves that are capable of producing a solution with a strictly improving objective function value.

Chicano, Francisco; Whitley, Darrell; Sutton, Andrew M. Efficient identification of improving moves in a ball for pseudoboolean problems. Genetic and Evolutionary Computation Conference (GECCO) 2014: 437444
Hill climbing algorithms are at the core of many approaches to solve optimization problems. Such algorithms usually require the complete enumeration of a neighborhood of the current solution. In the case of problems defined over binary strings of length \(n\), we define the \(r\)ball neighborhood as the set of solutions at Hamming distance \(r\) or less from the current solution. For \(r \ll n\) this neighborhood contains \(\Theta(n^r)\) solutions. In this paper efficient methods areintroduced to locate improving moves in the \(r\)ball neighborhood for problems that can be written as a sum of a linear number of subfunctions depending on a bounded number of variables. NKlandscapes and MAX\(k\)SAT are examples of these problems. If the number of subfunctions depending on any given variable is also bounded, then we prove that the method can explore the neighborhood in constant time, despite the fact that the number of solutions in the neighborhood is polynomial in \(n\). We develop a hill climber based on our exploration method and we analyze its efficiency and efficacy using experiments with NKqlandscapes instances.

Sutton, Andrew M.; Neumann, Frank Runtime Analysis of Evolutionary Algorithms on Randomly Constructed HighDensity Satisfiable 3CNF Formulas. Parallel Problem Solving from Nature (PPSN) 2014: 942951
We show that simple mutationonly evolutionary algorithms find a satisfying assignment on two similar models of random planted 3CNF Boolean formulas in polynomial time with high probability in the high constraint density regime. We extend the analysis to random formulas conditioned on satisfiability (i.e., the socalled filtered distribution) and conclude that most highdensity satisfiable formulas are easy for simple evolutionary algorithms. With this paper, we contribute the first rigorous study of randomized search heuristics from the evolutionary computation community on wellstudied distributions of random satisfiability problems.

Sutton, Andrew M.; Chicano, Francisco; Whitley, L. Darrell Fitness Function Distributions over Generalized Search Neighborhoods in the qary Hypercube. Evolutionary Computation 2013: 561590
The frequency distribution of a fitness function over regions of its domain is an important quantity for understanding the behavior of algorithms that employ randomized sampling to search the function. In general, exactly characterizing this distribution is at least as hard as the search problem, since the solutions typically live in the tails of the distribution. However, in some cases it is possible to efficiently retrieve a collection of quantities called moments that describe the distribution. In this paper, we consider functions of bounded epistasis that are defined over length\(n\) strings from a finite alphabet of cardinality \(q\). Many problems in combinatorial optimization can be specified as search problems over functions of this type. Employing Fourier analysis of functions over finite groups, we derive an efficient method for computing the exact moments of the frequency distribution of fitness functions over Hamming regions of the \(q\)ary hypercube. We then use this approach to derive equations that describe the expected fitness of the offspring of any point undergoing uniform mutation. The results we present provide insight into the statistical structure of the fitness function for a number of combinatorial problems. For the graph coloring problem, we apply our results to efficiently compute the average number of constraint violations that lie within a certain number of steps of any coloring. We derive an expression for the mutation rate that maximizes the expected fitness of an offspring at each fitness level. We also apply the results to the slightly more complex frequency assignment problem, a relevant application in the domain of the telecommunications industry. As with the graph coloring problem, we provide formulas for the average value of the fitness function in Hamming regions around a solution and the expectationoptimal mutation rate.

Nallaperuma, Samadhi; Sutton, Andrew M.; Neumann, Frank Fixedparameter evolutionary algorithms for the Euclidean Traveling Salesperson problem. Congress on Evolutionary Computation (CEC) 2013: 20372044
We contribute to the theoretical understanding of evolutionary algorithms and carry out a parameterized analysis of evolutionary algorithms for the Euclidean traveling salesperson problem (Euclidean TSP). We exploit structural properties related to the optimization process of evolutionary algorithms for this problem and use them to bound the runtime of evolutionary algorithms. Our analysis studies the runtime in dependence of the number of inner points \(k\) and shows that simple evolutionary algorithms solve the Euclidean TSP in expected time \(O(n^{4k(2k  1)!)\). Moreover, we show that, under reasonable geometric constraints, a locally optimal 2opt tour can be found by randomized local search in expected time \(O(n^{2k}!)\).

Nallaperuma, Samadhi; Sutton, Andrew M.; Neumann, Frank Parameterized complexity analysis and more effective construction methods for ACO algorithms and the euclidean traveling salesperson problem. Congress on Evolutionary Computation (CEC) 2013: 20452052
We propose a new construction procedure for ant colony optimization (ACO) algorithms working on the Euclidean traveling salesperson problem (TSP) that preserves the ordering on the convex hull of the points in the instance. The procedure is inspired by theoretical analyses for simple evolutionary algorithms that are provably more efficient on instances where the number of inner points of the instance is not too large. We integrate the construction procedure into the wellknown MaxMin Ant System (MMAS) and empirically show that it leads to more efficient optimization on instances where the number of inner points is not too high.

Sutton, Andrew M.; Whitley, L. Darrell; Howe, Adele E. Computing the moments of kbounded pseudoBoolean functions over Hamming spheres of arbitrary radius in polynomial time. Theoretical Computer Science 2012: 5874
We show that given a \(k\)bounded pseudoBoolean function \(f\), one can always compute the \(c\)th moment of \(f\) over regions of arbitrary radius in Hamming space in polynomial time using algebraic information from the adjacency structure (where \(k\) and \(c\) are constants). This result has implications for evolutionary algorithms and local search algorithms because information about promising regions of the search space can be efficiently retrieved, even if the cardinality of the region is exponential in the problem size. Finally, we use our results to introduce a method of efficiently calculating the expected fitness of mutations for evolutionary algorithms.

Whitley, Darrell; Sutton, Andrew M. Genetic Algorithms  A Survey of Models and Methods. Handbook of Natural Computing 2012: 637671
This chapter first reviews the simple genetic algorithm. Mathematical models of the genetic algorithm are also reviewed, including the schema theorem, exact infinite population models, and exact Markov models for finite populations. The use of bit representations, including Gray encodings and binary encodings, is discussed. Selection, including roulette wheel selection, rankbased selection, and tournament selection, is also described. This chapter then reviews other forms of genetic algorithms, including the steadystate Genitor algorithm and the CHC (crossgenerational elitist selection, heterogenous recombination, and cataclysmic mutation) algorithm. Finally, landscape structures that can cause genetic algorithms to fail are looked at, and an application of genetic algorithms in the domain of resource scheduling, where genetic algorithms have been highly successful, is also presented.

Sutton, Andrew M.; Neumann, Frank A Parameterized Runtime Analysis of Evolutionary Algorithms for the Euclidean Traveling Salesperson Problem. Conference on Artificial Intelligence (AAAI) 2012
Parameterized runtime analysis seeks to understand the influence of problem structure on algorithmic runtime. In this paper, we contribute to the theoretical understanding of evolutionary algorithms and carry out a parameterized analysis of evolutionary algorithms for the Euclidean traveling salesperson problem (Euclidean TSP). We investigate the structural properties in TSP instances that influence the optimization process of evolutionary algorithms and use this information to bound the runtime of simple evolutionary algorithms. Our analysis studies the runtime in dependence of the number of inner points \(k\) and shows that \((\mu + \lambda)\) evolutionary algorithms solve the Euclidean TSP in expected time \(O((\mu / \lambda) cdot n^3 \gamma(\epsilon) + n \gamma(\epsilon)+(\mu / \lambda) cdot n^4k(2k1)!)\) where \(\gamma\) is a function of the minimum angle \(epsilon\) between any three points. Finally, our analysis provides insights into designing a mutation operator that improves the upper bound on expected runtime. We show that a mixed mutation strategy that incorporates both 2opt moves and permutation jumps results in an upper bound of \(O((\mu / \lambda) cdot n^3 \gamma(\epsilon) + n \gamma(\epsilon) + (\mu / \lambda) cdot n^2k(k1)!)\) for the \((\mu + \lambda)\) EA.

Sutton, Andrew M.; Day, Jareth; Neumann, Frank A parameterized runtime analysis of evolutionary algorithms for MAX2SAT. Genetic and Evolutionary Computation Conference (GECCO) 2012: 433440
We investigate the MAX2SAT problem and study evolutionary algorithms by parameterized runtime analysis. The parameterized runtime analysis of evolutionary algorithms has been initiated recently and reveals new insights into which type of instances of NPhard combinatorial optimization problems are hard to solve by evolutionary computing methods. We show that a variant of the (1+1) EA is a fixedparameter evolutionary algorithm with respect to the standard parameterization for MAX2SAT. Furthermore, we study how the dependencies between the variables affect problem difficulty and present fixedparameter evolutionary algorithms for the MAX(2,3)SAT problem where the studied parameter is the diameter of the variable graph.

Sutton, Andrew M.; Neumann, Frank A Parameterized Runtime Analysis of Simple Evolutionary Algorithms for Makespan Scheduling. Parallel Problem Solving from Nature (PPSN) 2012: 5261
We consider simple multistart evolutionary algorithms applied to the classical NPhard combinatorial optimization problem of Makespan Scheduling on two machines. We study the dependence of the runtime of this type of algorithm on three different key hardness parameters. By doing this, we provide further structural insights into the behavior of evolutionary algorithms for this classical problem.

Sutton, Andrew M.; Whitley, Darrell; Howe, Adele E. Approximating the distribution of fitness over hamming regions. Foundations of Genetic Algorithms (FOGA) 2011: 93104
The distribution of fitness values across a set of states sharply influences the dynamics of evolutionary processes and heuristic search in combinatorial optimization. In this paper we present a method for approximating the distribution of fitness values over Hamming regions by solving a linear programming problem that incorporates low order moments of the target function. These moments can be retrieved in polynomial time for select problems such as MAX\(k\)SAT using Walsh analysis. The method is applicable to any real function on binary strings that is epistatically bounded and discrete with asymptotic bounds on the cardinality of its codomain. We perform several studies on the ONEMAX and MAX\(k\)SAT domains to assess the accuracy of the approximation and its dependence on various factors. We show that the approximation can accurately predict the number of states within a Hamming region that have an improving fitness value.

Sutton, Andrew M.; Whitley, Darrell; Howe, Adele E. Mutation rates of the (1+1)EA on pseudoboolean functions of bounded epistasis. Genetic and Evolutionary Computation Conference (GECCO) 2011: 973980
When the epistasis of the fitness function is bounded by a constant, we show that the expected fitness of an offspring of the (1+1)EA can be efficiently computed for any point. Moreover, we show that, for any point, it is always possible to efficiently retrieve the "best" mutation rate at that point in the sense that the expected fitness of the resulting offspring is maximized. On linear functions, it has been shown that a mutation rate of \(1/n\) is provably optimal. On functions where epistasis is bounded by a constant \(k\), we show that for sufficiently high fitness, the commonly used mutation rate of \(1/n\) is also best, at least in terms of maximizing the expected fitness of the offspring. However, we find for certain ranges of the fitness function, a better mutation rate can be considerably higher, and can be found by solving for the real roots of a degree\(k\) polynomial whose coefficients contain the nonzero Walsh coefficients of the fitness function. Simulation results on maximum \(k\)satisfiability problems and NKlandscapes show that this expectationmaximized mutation rate can cause significant gains early in search.

Sutton, Andrew M.; Whitley, L. Darrell; Howe, Adele E. A polynomial time computation of the exact correlation structure of ksatisfiability landscapes. Genetic and Evolutionary Computation Conference (GECCO) 2009: 365372
The autocorrelation function and related correlation length are statistical quantities that capture the ruggedness of the fitness landscape: a measure that is directly related to the hardness of a problem for certain heuristic search algorithms. Typically, these quantities are estimated empirically by sampling along a random walk. In this paper, we show that a polynomialtime Walsh decomposition of the \(k\)satisfiability evaluation function allows us to compute the exact autocorrelation function and correlation length for any given \(k\)satisfiability instance. We also use the decomposition to compute a theoretical expectation for the autocorrelation function and correlation length over the ensemble of instances generated uniformly at random. We find that this expectation is invariant to the constrainedness of the problem as measured by the ratio of clauses to variables. However, we show that filtered problems, which are typically used in local search studies, have a bias that causes a significant deviation from the expected correlation structure of unfiltered, uniformly generated problems.

Whitley, L. Darrell; Sutton, Andrew M. Partial neighborhoods of elementary landscapes. Genetic and Evolutionary Computation Conference (GECCO) 2009: 381388
This paper introduces a new component based model that makes it relatively simple to prove that certain types of landscapes are elementary. We use the model to reconstruct proofs for the Traveling Salesman Problem, Graph Coloring and MinCut Graph Partitioning. The same model is then used to efficiently compute the average values over particular partial neighborhoods for these same problems. For Graph Coloring and MinCut Graph Partitioning, this computation can be used to focus search on those moves that are most likely to yield an improving move, ignoring moves that cannot yield an improving move. Let \(x\) be a candidate solution with objective function value \(f(x)\). The mean value of the objective function over the entire landscape is denoted \(f\). Normally in an elementary landscape one can only be sure that a neighborhood includes an improving move (assuming minimization) if \(f(x) > f\). However, by computing the expected value of an appropriate partial neighborhood it is sometimes possible to know that an improving move exists in the partial neighborhood even when \(f(x) < f\).

Sutton, Andrew M.; Howe, Adele E.; Whitley, L. Darrell Estimating Bounds on Expected Plateau Size in MAXSAT Problems. Stochastic Local Search Algorithms (SLS) 2009: 3145
Stochastic local search algorithms can now successfully solve MAXSAT problems with thousands of variables or more. A key to this success is how effectively the search can navigate and escape plateau regions. Furthermore, the solubility of a problem depends on the size and exit density of plateaus, especially those closest to the optimal solution. In this paper we model the plateau phenomenon as a percolation process on hypercube graphs. We develop two models for estimating bounds on the size of plateaus and prove that one is a lower bound and the other an upper bound on the expected size of plateaus at a given level. The models' accuracy is demonstrated on controlled random hypercube landscapes. We apply the models to MAXSAT through analogy to hypercube graphs and by introducing an approach to estimating, through sampling, a key parameter of the models. Using this approach, we assess the accuracy of our bound estimations on uniform random and structured benchmarks. Surprisingly, we find similar trends in accuracy across random and structured problem instances. Less surprisingly, we find a high accuracy on smaller plateaus with systematic divergence as plateaus increase in size.

Sutton, Andrew M.; Howe, Adele E.; Whitley, L. Darrell A Theoretical Analysis of the kSatisfiability Search Space. Stochastic Local Search Algorithms (SLS) 2009: 4660
Local search algorithms perform surprisingly well on the \(k\)satisfiability (\(k\)SAT) problem. However, few theoretical analyses of the \(k\)SAT search space exist. In this paper we study the search space of the \(k\)SAT problem and show that it can be analyzed by a decomposition. In particular, we prove that the objective function can be represented as a superposition of exactly \(k\) elementary landscapes. We show that this decomposition allows us to immediately compute the expectation of the objective function evaluated across neighboring points. We use this result to prove previously unknown bounds for local maxima and plateau width in the 3SAT search space. We compute these bounds numerically for a number of instances and show that they are nontrivial across a large set of benchmarks.

Matthews, David C.; Sutton, Andrew M.; Hains, Doug; Whitley, L. Darrell Improved Robustness through Population Variance in Ant Colony Optimization. Stochastic Local Search Algorithms (SLS) 2009: 145149
Ant Colony Optimization algorithms are populationbased Stochastic Local Search algorithms that mimic the behavior of ants, simulating pheromone trails to search for solutions to combinatorial optimization problems. This paper introduces Population Variance, a novel approach to ACO algorithms that allows parameters to vary across the population over time, leading to solution construction differences that are not strictly stochastic. The increased exploration appears to help the search escape from local optima, significantly improving the robustness of the algorithm with respect to suboptimal parameter settings.

Whitley, Darrell; Sutton, Andrew M.; Howe, Adele E.; Barbulescu, Laura Resource Scheduling with Permutation Based Representations: Three Applications. Evolutionary Computation in Practice 2008: 219243
Resource based scheduling using permutation based representations is reviewed. Permutation based representations are used in conjunction with genetic algorithms and local search algorithms for solving three very different scheduling problems. First, the Coors warehouse scheduling problem involves finding a permutation of customer orders that minimizes the average time that customers' orders spend at the loading docks while at the same time minimizing the running average inventory. Second, scheduling the Air Force Satellite Control Network (AFSCN) involves scheduling customer requests for contact time with a satellite via a ground station, where slot times on a ground station is the limited resource. The third application is scheduling the tracking of objects in space using ground based radar systems. Both satellites and debris in space must be tracked on regular basis to maintain knowledge about the location and orbit of the object. The ground based radar system is the limited resource, but unlike AFSCN scheduling, this application involves significant uncertainty.

Whitley, Darrell; Sutton, Andrew M.; Howe, Adele E. Understanding elementary landscapes. Genetic and Evolutionary Computation Conference (GECCO) 2008: 585592
The landscape formalism unites a finite candidate solution set to a neighborhood topology and an objective function. This construct can be used to model the behavior of local search on combinatorial optimization problems. A landscape is elementary when it possesses a unique property that results in a relative smoothness and decomposability to its structure. In this paper we explain elementary landscapes in terms of the expected value of solution components which are transformed in the process of moving from an incumbent solution to a neighboring solution. We introduce new results about the properties of elementary landscapes and discuss the practical implications for search algorithms.

Lunacek, Monte; Whitley, Darrell; Sutton, Andrew M. The Impact of Global Structure on Search. Parallel Problem Solving from Nature (PPSN) 2008: 498507
Populationbased methods are often considered superior on multimodal functions because they tend to explore more of the fitness landscape before they converge. We show that the effectiveness of this strategy is highly dependent on a function's global structure. When the local optima are not structured in a predictable way, exploration can misguide search into suboptimal regions. Limiting exploration can result in a better nonintuitive global search strategy.

Sutton, Andrew M.; Lunacek, Monte; Whitley, L. Darrell Differential evolution and nonseparability: using selective pressure to focus search. Genetic and Evolutionary Computation Conference (GECCO) 2007: 14281435
Recent results show that the Differential Evolution algorithm has significant difficulty on functions that are not linearly separable. On such functions, the algorithm must rely primarily on its differential mutation procedure which, unlike its recombination strategy, is rotationally invariant. We conjecture that this mutation strategy lacks sufficient selective pressure when appointing parent and donor vectors to have satisfactory exploitative power on nonseparable functions. We find that imposing pressure in the form of rankbased differential mutation results in a significant improvement of exploitation on rotated benchmarks.

Sutton, Andrew M.; Howe, Adele E.; Whitley, L. Darrell Using Adaptive Priority Weighting to Direct Search in Probabilistic Scheduling. International Conference on Automated Planning and Scheduling (ICAPS) 2007: 320327
Many scheduling problems reside in uncertain and dynamic environments  tasks have a nonzero probability of failure and may need to be rescheduled. In these cases, an optimized solution for a shortterm time horizon may have a detrimental impact over a broader time scale. We examine a scheduling domain in which time and energy on a phased array radar system is allocated to track objects in orbit around the earth. This domain requires probabilistic modeling to optimize the expected number of successful tasks on a particular day. Failed tasks must be attempted again on subsequent days. Given a set of task requests, we study two longterm objectives: percentage of requests initially successful, and the average time between successful request updates. We investigate adaptive priority weighting strategies that directly influence the shortterm objective function and thus indirectly influence the longterm goals. We find that adapting priority weights based on when individual tasks succeed or fail allows a catalog of requests to be filled more quickly. Furthermore, with adaptive priorities, we observe a Paretofront effect between the two longterm objectives as we modify how priorities are weighted, but an inverse effect of weighting when the priorities are not adapted.

Smith, Jay; Briceno, Luis Diego; Maciejewski, Anthony A.; Siegel, Howard Jay; Renner, Timothy; Shestak, Vladimir; Ladd, Joshua; Sutton, Andrew M.; Janovy, David L.; Govindasamy, Sudha; Alqudah, Amin; Dewri, Rinku; Prakash, Puneet Measuring the Robustness of Resource Allocations in a Stochastic Dynamic Environment. International Parallel and Distributed Processing Symposium (IPDPS) 2007: 110
Heterogeneous distributed computing systems often must operate in an environment where system parameters are subject to uncertainty. Robustness can be defined as the degree to which a system can function correctly in the presence of parameter values different from those assumed. We present a methodology for quantifying the robustness of resource allocations in a dynamic environment where task execution times are stochastic. The methodology is evaluated through measuring the robustness of three different resource allocation heuristics within the context of a stochastic dynamic environment. A Bayesian regression model is fit to the combined results of the three heuristics to demonstrate the correlation between the stochastic robustness metric and the presented performance metric. The correlation results demonstrated the significant potential of the stochastic robustness metric to predict the relative performance of the three heuristics given a common objective function.