
Kötzing, Timo Iterative Learning from Positive Data and Counters. Algorithmic Learning Theory (ALT) 2011: 4054
We analyze iterative learning in the limit from positive data with the additional information provided by a counter. The simplest type of counter provides the current iteration number (counting up from 0 to infinity), which is known to improve learning power over plain iterative learning. We introduce five other (weaker) counter types, for example only providing some unbounded and nondecreasing sequence of numbers. Analyzing these types allows one to understand which properties of a counter learning can benefit from. For the iterative setting, we completely characterize the relative power of the learning criteria corresponding to the counter types. In particular, for our types, the only properties improving learning power are unboundedness and strict monotonicity. Furthermore, we show that each of our types of counter improves learning power over weaker ones in some settings; to this end, we analyze transductive and nonUshaped learning. Finally we show that, for iterative learning criteria with one of our types of counter, separations of learning criteria are necessarily witnessed by classes containing only infinite languages.

Friedrich, Tobias; Levine, Lionel Fast Simulation of LargeScale Growth Models. Approximation Algorithms for Combinatorial Optimization (APPROX) 2011: 555566
We give an algorithm that computes the final state of certain growth models without computing all intermediate states. Our technique is based on a "least action principle" which characterizes the odometer function of the growth process. Starting from an approximation for the odometer, we successively correct under and overestimates and provably arrive at the correct final state. The degree of speedup depends on the accuracy of the initial guess. Determining the size of the boundary fluctuations in growth models like internal diffusionlimited aggregation (IDLA) is a longstanding open problem in statistical physics. As an application of our method, we calculate the size of fluctuations over two orders of magnitude beyond previous simulations.

Friedrich, Tobias; Kroeger, Trent; Neumann, Frank Weighted Preferences in Evolutionary Multiobjective Optimization. Australasian Conference on Artificial Intelligence (AUSAI) 2011: 291300
Evolutionary algorithms have been widely used to tackle multiobjective optimization problems. Incorporating preference information into the search of evolutionary algorithms for multiobjective optimization is of great importance as it allows one to focus on interesting regions in the objective space. Zitzler et al. have shown how to use a weight distribution function on the objective space to incorporate preference information into hypervolumebased algorithms. We show that this weighted information can easily be used in other popular EMO algorithms as well. Our results for NSGAII and SPEA2 show that this yields similar results to the hypervolume approach and requires less computational effort.

Friedrich, Tobias; Bringmann, Karl; Voß, Thomas; Igel, Christian The logarithmic hypervolume indicator. Foundations of Genetic Algorithms (FOGA) 2011: 8192
It was recently proven that sets of points maximizing the hypervolume indicator do not give a good multiplicative approximation of the Pareto front. We introduce a new "logarithmic hypervolume indicator" and prove that it achieves a closetooptimal multiplicative approximation ratio. This is experimentally verified on several benchmark functions by comparing the approximation quality of the multiobjective covariance matrix evolution strategy (MOCMAES) with the classic hypervolume indicator and the MOCMAES with the logarithmic hypervolume indicator.

Sutton, Andrew M.; Whitley, Darrell; Howe, Adele E. Approximating the distribution of fitness over hamming regions. Foundations of Genetic Algorithms (FOGA) 2011: 93104
The distribution of fitness values across a set of states sharply influences the dynamics of evolutionary processes and heuristic search in combinatorial optimization. In this paper we present a method for approximating the distribution of fitness values over Hamming regions by solving a linear programming problem that incorporates low order moments of the target function. These moments can be retrieved in polynomial time for select problems such as MAX\(k\)SAT using Walsh analysis. The method is applicable to any real function on binary strings that is epistatically bounded and discrete with asymptotic bounds on the cardinality of its codomain. We perform several studies on the ONEMAX and MAX\(k\)SAT domains to assess the accuracy of the approximation and its dependence on various factors. We show that the approximation can accurately predict the number of states within a Hamming region that have an improving fitness value.

Doerr, Benjamin; Johannsen, Daniel; Kötzing, Timo; Lehre, Per Kristian; Wagner, Markus; Winzen, Carola Faster blackbox algorithms through higher arity operators. Foundations of Genetic Algorithms (FOGA) 2011: 163172
We extend the work of Lehre and Witt (GECCO 2010) on the unbiased blackbox model by considering higher arity variation operators. In particular, we show that already for binary operators the blackbox complexity of LeadingOnes drops from \(\Theta(n^2)\) for unary operators to \(O(n \log n)\). For OneMax, the \(\Omega(n \log n)\) unary blackbox complexity drops to \(O(n)\) in the binary case. For \(k\)ary operators, \(k \le n\), the OneMaxcomplexity further decreases to \(O(n / \log k)\).

Kötzing, Timo; Neumann, Frank; Sudholt, Dirk; Wagner, Markus Simple maxmin ant systems and the optimization of linear pseudoboolean functions. Foundations of Genetic Algorithms (FOGA) 2011: 209218
With this paper, we contribute to the understanding of ant colony optimization (ACO) algorithms by formally analyzing their runtime behavior. We study simple MAXMIN ant systems on the class of linear pseudoBoolean functions defined on binary strings of length n. Our investigations point out how the progress according to function values is stored in the pheromones. We provide a general upper bound of \(O((n^3 \log n)\rho)\) on the running time for two ACO variants on all linear functions, where \(\rho\) determines the pheromone update strength. Furthermore, we show improved bounds for two wellknown linear pseudoBoolean functions called ONEMAX and BINVAL and give additional insights using an experimental study.

Bringmann, Karl; Friedrich, Tobias Convergence of hypervolumebased archiving algorithms I: effectiveness. Genetic and Evolutionary Computation Conference (GECCO) 2011: 745752
Nominated for Best Paper Award (EMO Track)
The core of hypervolumebased multiobjective evolutionary algorithms is an archiving algorithm which performs the environmental selection. A \((\mu+\lambda)\)archiving algorithm defines how to choose \(\mu\) children from \(\mu\) parents and \(\lambda\) offspring together. We study theoretically \((\mu+\lambda)\)archiving algorithms which never decrease the hypervolume from one generation to the next. Zitzler, Thiele, and Bader (IEEE Trans. Evolutionary Computation, 14:5879, 2010) proved that all \((\mu+1)\)archiving algorithms are ineffective, which means there is an initial population such that independent of the used reproduction rule, a set with maximum hypervolume cannot be reached. We extend this and prove that for \(\lambda < \mu\) all archiving algorithms are ineffective. On the other hand, locally optimal algorithms, which maximize the hypervolume in each step, are effective for \(\lambda = \mu\) and can always find a population with hypervolume at least half the optimum for \(\lambda < \mu\). We also prove that there is no hypervolumebased archiving algorithm which can always find a population with hypervolume greater than \(1 / (1 + 0.1338, ( 1/\lambda  1/\mu) )\) times the optimum.

Sutton, Andrew M.; Whitley, Darrell; Howe, Adele E. Mutation rates of the (1+1)EA on pseudoboolean functions of bounded epistasis. Genetic and Evolutionary Computation Conference (GECCO) 2011: 973980
When the epistasis of the fitness function is bounded by a constant, we show that the expected fitness of an offspring of the (1+1)EA can be efficiently computed for any point. Moreover, we show that, for any point, it is always possible to efficiently retrieve the "best" mutation rate at that point in the sense that the expected fitness of the resulting offspring is maximized. On linear functions, it has been shown that a mutation rate of \(1/n\) is provably optimal. On functions where epistasis is bounded by a constant \(k\), we show that for sufficiently high fitness, the commonly used mutation rate of \(1/n\) is also best, at least in terms of maximizing the expected fitness of the offspring. However, we find for certain ranges of the fitness function, a better mutation rate can be considerably higher, and can be found by solving for the real roots of a degree\(k\) polynomial whose coefficients contain the nonzero Walsh coefficients of the fitness function. Simulation results on maximum \(k\)satisfiability problems and NKlandscapes show that this expectationmaximized mutation rate can cause significant gains early in search.

Doerr, Benjamin; Lengler, Johannes; Kötzing, Timo; Winzen, Carola Blackbox complexities of combinatorial problems. Genetic and Evolutionary Computation Conference (GECCO) 2011: 981988
Blackbox complexity is a complexity theoretic measure for how difficult a problem is to be optimized by a general purpose optimization algorithm. It is thus one of the few means trying to understand which problems are tractable for genetic algorithms and other randomized search heuristics. Most previous work on blackbox complexity is on artificial test functions. In this paper, we move a step forward and give a detailed analysis for the two combinatorial problems minimum spanning tree and singlesource shortest paths. Besides giving interesting bounds for their blackbox complexities, our work reveals that the choice of how to model the optimization problem is nontrivial here. This in particular comes true where the search space does not consist of bit strings and where a reasonable definition of unbiasedness has to be agreed on.

Kötzing, Timo; Sudholt, Dirk; Theile, Madeleine How crossover helps in pseudoboolean optimization. Genetic and Evolutionary Computation Conference (GECCO) 2011: 989996
Understanding the impact of crossover on performance is a major problem in the theory of genetic algorithms (GAs). We present new insight on working principles of crossover by analyzing the performance of crossoverbased GAs on the simple functions OneMax and Jump. First, we assess the potential speedup by crossover when combined with a fitnessinvariant bit shuffling operator that simulates a lineage of independent evolution on a function of unitation. Theoretical and empirical results show drastic speedups for both functions. Second, we consider a simple GA without shuffling and investigate the interplay of mutation and crossover on Jump. If the crossover probability is small, subsequent mutations create sufficient diversity, even for very small populations. Contrarily, with high crossover probabilities crossover tends to lose diversity more quickly than mutation can create it. This has a drastic impact on the performance on Jump. We complement our theoretical findings by Monte Carlo simulations on the population diversity.

Doerr, Benjamin; Kötzing, Timo; Winzen, Carola Too fast unbiased blackbox algorithms. Genetic and Evolutionary Computation Conference (GECCO) 2011: 20432050
Unbiased blackbox complexity was recently introduced as a refined complexity model for randomized search heuristics (Lehre and Witt, GECCO 2010). For several problems, this notion avoids the unrealistically low complexity results given by the classical model of Droste, Jansen, and Wegener (Theor. Comput. Sci. 2006). In this work, we show that for two natural problems the unbiased blackbox complexity remains artificially small. For the classical JumpK test function class and for a subclass of the wellknown Partition problem, we give mutationonly unbiased blackbox algorithms having complexity \(O(n \log n)\). Since the first problem usually needs \(\Theta(n^k)\) function evaluations to be optimized by standard heuristics and the second is even NPcomplete, these blackbox complexities seem not to indicate the true difficulty of the two problems for randomized search heuristics.

Kötzing, Timo; Neumann, Frank; Spöhel, Reto PAC learning and genetic programming. Genetic and Evolutionary Computation Conference (GECCO) 2011: 20912096
Genetic programming (GP) is a very successful type of learning algorithm that is hard to understand from a theoretical point of view. With this paper we contribute to the computational complexity analysis of genetic programming that has been started recently. We analyze GP in the wellknown PAC learning framework and point out how it can observe quality changes in the the evolution of functions by random sampling. This leads to computational complexity bounds for a linear GP algorithm for perfectly learning any member of a simple class of linear pseudoBoolean functions. Furthermore, we show that the same algorithm on the functions from the same class finds good approximations of the target function in less time.

Fellows, Michael R.; Friedrich, Tobias; Hermelin, Danny; Narodytska, Nina; Rosamond, Frances A. Constraint Satisfaction Problems: Convexity Makes AllDifferent Constraints Tractable. International Joint Conference on Artificial Intelligence (IJCAI) 2011: 522527
We examine the complexity of constraint satisfaction problems that consist of a set of AllDiff constraints. Such CSPs naturally model a wide range of realworld and combinatorial problems, like scheduling, frequency allocations, and graph coloring problems. As this problem is known to be NPcomplete, we investigate under which further assumptions it becomes tractable. We observe that a crucial property seems to be the convexity of the variable domains and constraints. Our main contribution is an extensive study of the complexity of Multiple AllDiff CSPs for a set of natural parameters, like maximum domain size and maximum size of the constraint scopes. We show that, depending on the parameter, convexity can make the problem tractable even though it is provably intractable in general. Interestingly, the convexity of constraints is the key property in achieving fixed parameter tractability, while the convexity of domains does not usually make the problem easier.

Bringmann, Karl; Friedrich, Tobias; Neumann, Frank; Wagner, Markus ApproximationGuided Evolutionary MultiObjective Optimization. International Joint Conference on Artificial Intelligence (IJCAI) 2011: 11981203
Multiobjective optimization problems arise frequently in applications but can often only be solved approximately by heuristic approaches. Evolutionary algorithms have been widely used to tackle multiobjective problems. These algorithms use different measures to ensure diversity in the objective space but are not guided by a formal notion of approximation. We present a new framework of an evolutionary algorithm for multiobjective optimization that allows to work with a formal notion of approximation. Our experimental results show that our approach outperforms stateoftheart evolutionary algorithms in terms of the quality of the approximation that is obtained in particular for problems with many objectives.

Friedrich, Tobias; Sauerwald, Thomas; Stauffer, Alexandre Diameter and Broadcast Time of Random Geometric Graphs in Arbitrary Dimensions. International Symposium on Algorithms and Computation (ISAAC) 2011: 190199
A random geometric graph (RGG) is defined by placing \(n\) points uniformly at random in \([0,n^{1/d}]^d\), and joining two points by an edge whenever their Euclidean distance is at most some fixed \(r\). We assume that \(r\) is larger than the critical value for the emergence of a connected component with \(\Omega(n)\) nodes. We show that, with high probability (w.h.p.), for any two connected nodes with a Euclidean distance of \(\omega(\log n / r^{d1})\), their graph distance is only a constant factor larger than their Euclidean distance. This implies that the diameter of the largest connected component is \(\Theta(n^{1/d}/r)\) w.h.p. We also prove that the condition on the Euclidean distance above is essentially tight. We also analyze the following randomized broadcast algorithm on RGGs. At the beginning, only one node from the largest connected component of the RGG is informed. Then, in each round, each informed node chooses a neighbor independently and uniformly at random and informs it. We prove that w.h.p. this algorithm informs every node in the largest connected component of an RGG within \(\Theta(n^1/d/r+\log n)\) rounds.

Lenzner, Pascal On Dynamics in Basic Network Creation Games. Symposium on Algorithmic Game Theory (SAGT) 2011: 254265
We initiate the study of game dynamics in the Sum Basic Network Creation Game, which was recently introduced by Alon et al.[SPAA'10]. In this game players are associated to vertices in a graph and are allowed to "swap" edges, that is to remove an incident edge and insert a new incident edge. By performing such moves, every player tries to minimize her connection cost, which is the sum of distances to all other vertices. When played on a tree, we prove that this game admits an ordinal potential function, which implies guaranteed convergence to a pure Nash Equilibrium. We show a cubic upper bound on the number of steps needed for any improving response dynamic to converge to a stable tree and propose and analyse a best response dynamic, where the players having the highest cost are allowed to move. For this dynamic we show an almost tight linear upper bound for the convergence speed. Furthermore, we contrast these positive results by showing that, when played on general graphs, this game allows best response cycles. This implies that there cannot exist an ordinal potential function and that fundamentally different techniques are required for analysing this case. For computing a best response we show a similar contrast: On the one hand we give a lineartime algorithm for computing a best response on trees even if players are allowed to swap multiple edges at a time. On the other hand we prove that this task is NPhard even on simple general graphs, if more than one edge can be swapped at a time. The latter addresses a proposal by Alon et al..

Berenbrink, Petra; Cooper, Colin; Friedetzky, Tom; Friedrich, Tobias; Sauerwald, Thomas Randomized Diffusion for Indivisible Loads. Symposium on Discrete Algorithms (SODA) 2011: 429439
We present a new randomized diffusionbased algorithm for balancing indivisible tasks (tokens) on a network. Our aim is to minimize the discrepancy between the maximum and minimum load. The algorithm works as follows. Every vertex distributes its tokens as evenly as possible among its neighbors and itself. If this is not possible without splitting some tokens, the vertex redistributes its excess tokens among all its neighbors randomly (without replacement). In this paper we prove several upper bounds on the load discrepancy for general networks. These bounds depend on some expansion properties of the network, that is, the second largest eigenvalue, and a novel measure which we refer to as refined local divergence. We then apply these general bounds to obtain results for some specific networks. For constantdegree expanders and torus graphs, these yield exponential improvements on the discrepancy bounds compared to the algorithm of Rabani, Sinclair, and Wanka. For hypercubes we obtain a polynomial improvement. In contrast to previous papers, our algorithm is vertexbased and not edgebased. This means excess tokens are assigned to vertices instead to edges, and the vertex reallocates all of its excess tokens by itself. This approach avoids nodes having "negative loads", but causes additional dependencies for the analysis.

Case, John; Kötzing, Timo Measuring Learning Complexity with Criteria Epitomizers. Symposium on Theoretical Aspects of Computer Science (STACS) 2011: 320331
In prior papers, beginning with the seminal work by Freivalds et al. 1995, the notion of intrinsic complexity is used to analyze the learning complexity of sets of functions in a Goldstyle learning setting. Herein are pointed out some weaknesses of this notion. Offered is an alternative based on epitomizing sets of functions  sets, which are learnable under a given learning criterion, but not under other criteria which are not at least as powerful. To capture the idea of epitomizing sets, new reducibility notions are given based on robust learning (closure of learning under certain classes of operators). Various degrees of epitomizing sets are characterized as the sets complete with respect to corresponding reducibility notions! These characterizations also provide an easy method for showing sets to be epitomizers, and they are, then, employed to prove several sets to be epitomizing. Furthermore, a scheme is provided to generate easily very strong epitomizers for a multitude of learning criteria. These strong epitomizers are socalled selflearning sets, previously applied by Case & Koetzing, 2010. These strong epitomizers can be generated and employed in a myriad of settings to witness the strict separation in learning power between the criteria so epitomized and other not as powerful criteria!

Antoniadis, Antonios; Hüffner, Falk; Lenzner, Pascal; Moldenhauer, Carsten; Souza, Alexander Balanced Interval Coloring. Symposium on Theoretical Aspects of Computer Science (STACS) 2011: 531542
We consider the discrepancy problem of coloring n intervals with \(k\) colors such that at each point on the line, the maximal difference between the number of intervals of any two colors is minimal. Somewhat surprisingly, a coloring with maximal difference at most one always exists. Furthermore, we give an algorithm with running time \(O(n \log n + k n \log k)\) for its construction. This is in particular interesting because many known results for discrepancy problems are nonconstructive. This problem naturally models a load balancing scenario, where \(n\) tasks with given start and endtimes have to be distributed among \(k\) servers. Our results imply that this can be done ideally balanced. When generalizing to \(d\)dimensional boxes (instead of intervals), a solution with difference at most one is not always possible. We show that for any \(d \ge 2\) and any \(k \ge 2\) it is NPcomplete to decide if such a solution exists, which implies also NPhardness of the respective minimization problem. In an online scenario, where intervals arrive over time and the color has to be decided upon arrival, the maximal difference in the size of color classes can become arbitrarily high for any online algorithm.

Doerr, Benjamin; Fouz, Mahmoud; Friedrich, Tobias Social networks spread rumors in sublogarithmic time. Symposium on Theory of Computing (STOC) 2011: 2130
With the prevalence of social networks, it has become increasingly important to understand their features and limitations. It has been observed that information spreads extremely fast in social networks. We study the performance of randomized rumor spreading protocols on graphs in the preferential attachment model. The wellknown random phone call model of Karp et al. (FOCS 2000) is a pushpull strategy where in each round, each vertex chooses a random neighbor and exchanges information with it. We prove the following.  The pushpull strategy delivers a message to all nodes within \(\Theta(\log n)\) rounds with high probability. The best known bound so far was \(O(\log^2 n)\).  If we slightly modify the protocol so that contacts are chosen uniformly from all neighbors but the one contacted in the previous round, then this time reduces to \(\Theta(\log n / \log \log n)\), which is the diameter of the graph. This is the first time that a sublogarithmic broadcast time is proven for a natural setting. Also, this is the first time that avoiding doublecontacts reduces the runtime to a smaller order of magnitude.