
Kötzing, Timo; Palenta, Raphaela A Map of Update Constraints in Inductive Inference. Algorithmic Learning Theory (ALT) 2014: 4054
We investigate how different learning restrictions reduce learning power and how the different restrictions relate to one another. We give a complete map for nine different restrictions both for the cases of complete information learning and setdriven learning. This completes the picture for these wellstudied delayable learning restrictions. A further insight is gained by different characterizations of conservative learning in terms of variants of cautious learning. Our analyses greatly benefit from general theorems we give, for example showing that learners with exclusively delayable restrictions can always be assumed total.

Jain, Sanjay; Kötzing, Timo; Ma, Junqi; Stephan, Frank On the Role of Update Constraints and TextTypes in Iterative Learning. Algorithmic Learning Theory (ALT) 2014: 5569
The present work investigates the relationship of iterative learning with other learning criteria such as decisiveness, caution, reliability, nonUshapedness, monotonicity, strong monotonicity and conservativeness. Building on the result of Case and Moelius that iterative learners can be made nonUshaped, we show that they also can be made cautious and decisive. Furthermore, we obtain various special results with respect to oneone texts, fat texts and oneone hypothesis spaces.

Göbel, Andreas; Goldberg, Leslie Ann; McQuillan, Colin; Richerby, David; Yamakami, Tomoyuki Counting List Matrix Partitions of Graphs. Conference on Computational Complexity (CCC) 2014: 5665
Given a symmetric \(D \times D\) matrix \(M\) over \(0, 1, *}\), a list \(M\)partition of a graph \(G\) is a partition of the vertices of \(G\) into \(D\) parts which are associated with the rows of \(M\). The part of each vertex is chosen from a given list in such a way that no edge of \(G\) is mapped to a 0 in \(M\) and no nonedge of \(G\) is mapped to a 1 in \(M\). Many important graphtheoretic structures can be represented as list \(M\)partitions including graph colourings, split graphs and homogeneous sets, which arise in the proofs of the weak and strong perfect graph conjectures. Thus, there has been quite a bit of work on determining for which matrices \(M\) computations involving list \(M\)partitions are tractable. This paper focuses on the problem of counting list \(M\)partitions, given a graph \(G\) and given lists for each vertex of \(G\). We give an algorithm that solves this problem in polynomial time for every (fixed) matrix \(M\) for which the problem is tractable. The algorithm relies on data structures such as sparsedense partitions and sub cube decompositions to reduce each problem instance to a sequence of problem instances in which the lists have a certain useful structure that restricts access to portions of \(M\) in which the interactions of 0s and 1s is controlled. We show how to solve the resulting restricted instances by converting them into particular counting constraint satisfaction problems (\#CSPs) which we show how to solve using a constraint satisfaction technique known as "arcconsistency". For every matrix \(M\) for which our algorithm fails, we show that the problem of counting list \(M\)partitions is #Pcomplete. Furthermore, we give an explicit characterisation of the dichotomy theorem  counting list \(M\)partitions is tractable (in FP) if and only if the matrix \(M\) has a structure called a derectangularising sequence. Finally, we show that the metaproblem of determining whether a given matrix has a derectangularising sequence is NPcomplete.

Bläsius, Thomas; Brückner, Guido; Rutter, Ignaz Complexity of HigherDegree Orthogonal Graph Embedding in the Kandinsky Model. European Symposium on Algorithms (ESA) 2014: 161172
We show that finding orthogonal gridembeddings of plane graphs (planar with fixed combinatorial embedding) with the minimum number of bends in the socalled Kandinsky model (which allows vertices of degree >4) is NPcomplete, thus solving a longstanding open problem. On the positive side, we give an efficient algorithm for several restricted variants, such as graphs of bounded branch width and a subexponential exact algorithm for general plane graphs.

Bringmann, Karl; Friedrich, Tobias; Krohmer, Anton Deanonymization of Heterogeneous Random Graphs in Quasilinear Time. European Symposium on Algorithms (ESA) 2014: 197208
There are hundreds of online social networks with billions of users in total. Many such networks publicly release structural information, with all personal information removed. Empirical studies have shown, however, that this provides a false sense of privacy  it is possible to identify almost all users that appear in two such anonymized network as long as a few initial mappings are known. We analyze this problem theoretically by reconciling two versions of an artificial powerlaw network arising from independent subsampling of vertices and edges. We present a new algorithm that identifies most vertices and makes no wrong identifications with high probability. The number of vertices matched is shown to be asymptotically optimal. For an \(n\)vertex graph, our algorithm uses \(n^\epsilon\) seed nodes (for an arbitrarily small \(\epsilon\)) and runs in quasilinear time. This improves previous theoretical results which need \(\Theta(n)\) seed nodes and have runtimes of order \(n^{1 + \Omega(1)}\). Additionally, the applicability of our algorithm is studied experimentally on different networks.

Casel, Katrin A FixedParameter Approach for PrivacyProtection with Global Recoding. Frontiers in Algorithmics (FAW) 2014: 2535
This paper discusses a problem arising in the field of privacyprotection in statistical databases: Given a \(n \times m\) \(\{0,1\}\)matrix \(M\), is there a set of mergings which transforms \(M\) into a zero matrix and only affects a bounded number of rows/columns. “Merging” here refers to combining adjacent lines with a componentwise logical AND. This kind transformation models a generalization on OLAPcubes also called global recoding. Counting the number of affected lines presents a new measure of informationloss for this method. Parameterized by the number of affected lines \(k\) we introduce reduction rules and an \(O^*(2.618^k)\)algorithm for the new abstract combinatorial problem LMAL.

Bläsius, Thomas; Rutter, Ignaz A New Perspective on Clustered Planarity as a Combinatorial Embedding Problem. Graph Drawing (GD) 2014: 440451
The clustered planarity problem (\(c\)planarity) asks whether a hierarchically clustered graph admits a planar drawing such that the clusters can be nicely represented by regions. We introduce the \(cd\)tree data structure and give a new characterization of \(c\)planarity. It leads to efficient algorithms for \(c\)planarity testing in the following cases. (i) Every cluster and every cocluster has at most two connected components. (ii) Every cluster has at most five outgoing edges. Moreover, the cdtree reveals interesting connections between \(c\)planarity and planarity with constraints on the order of edges around vertices. On one hand, this gives rise to a bunch of new open problems related to \(c\)planarity, on the other hand it provides a new perspective on previous results.

Chicano, Francisco; Whitley, Darrell; Sutton, Andrew M. Efficient identification of improving moves in a ball for pseudoboolean problems. Genetic and Evolutionary Computation Conference (GECCO) 2014: 437444
Hill climbing algorithms are at the core of many approaches to solve optimization problems. Such algorithms usually require the complete enumeration of a neighborhood of the current solution. In the case of problems defined over binary strings of length \(n\), we define the \(r\)ball neighborhood as the set of solutions at Hamming distance \(r\) or less from the current solution. For \(r \ll n\) this neighborhood contains \(\Theta(n^r)\) solutions. In this paper efficient methods areintroduced to locate improving moves in the \(r\)ball neighborhood for problems that can be written as a sum of a linear number of subfunctions depending on a bounded number of variables. NKlandscapes and MAX\(k\)SAT are examples of these problems. If the number of subfunctions depending on any given variable is also bounded, then we prove that the method can explore the neighborhood in constant time, despite the fact that the number of solutions in the neighborhood is polynomial in \(n\). We develop a hill climber based on our exploration method and we analyze its efficiency and efficacy using experiments with NKqlandscapes instances.

Bringmann, Karl; Friedrich, Tobias; Klitzke, Patrick Twodimensional subset selection for hypervolume and epsilonindicator. Genetic and Evolutionary Computation Conference (GECCO) 2014: 589596
The goal of biobjective optimization is to find a small set of good compromise solutions. A common problem for biobjective evolutionary algorithms is the following subset selection problem (SSP): Given \(n\) solutions \(P \subset \mathbb{R}^2\) in the objective space, select \(k\) solutions \(P^*\) from \(P\) that optimize an indicator function. In the hypervolume SSP we want to select \(k\) points \(P^*\) that maximize the hypervolume indicator \(I_{HYP}(P^*, r)\) for some reference point \(r \in R^2\). Similarly, the \(\epsilon\)indicator SSP aims at selecting \(\tilde k\) points \(P^*\) that minimize the \(\epsilon\)indicator \(I_{\epsilon}(P^*,\mathbb{R})\) for some reference set \(R \subset R^2\) of size \(m\) (which can be \(\mathbb{R}=P\)). We first present a new algorithm for the hypervolume SSP with runtime \(O(n (k + \log n))\). Our second main result is a new algorithm for the \(\epsilon\)indicator SSP with runtime \(O(n \log n + m \log m)\). Both results improve the current state of the art runtimes by a factor of (nearly) \(n\) and make the problems tractable for new applications. Preliminary experiments confirm that the theoretical results translate into substantial empirical runtime improvements.

Doerr, Benjamin; Doerr, Carola; Kötzing, Timo Unbiased blackbox complexities of jump functions: how to cross large plateaus. Genetic and Evolutionary Computation Conference (GECCO) 2014: 769776
We analyze the unbiased blackbox complexity of jump functions with large jump sizes. Among other results, we show that when the jump size is \((1/2  \epsilon)n\), that is, only a small constant fraction of the fitness values is visible, then the unbiased blackbox complexities for arities 3 and higher are of the same order as those for the simple OneMax function. Even for the extreme jump function, in which all but the two fitness values \(n/2\) and \(n\) are blanked out, polynomialtime mutationbased (i.e., unary unbiased) blackbox optimization algorithms exist. This is quite surprising given that for the extreme jump function almost the whole search space (all but a \(\Theta(n^{1/2})\) fraction) is a plateau of constant fitness.To prove these results, we introduce new tools for the analysis of unbiased blackbox complexities, for example, selecting the new parent individual not by comparing the fitnesses of the competing search points, but also by taking into account the (empirical) expected fitnesses of their offspring.

Gießen, Christian; Kötzing, Timo Robustness of populations in stochastic environments. Genetic and Evolutionary Computation Conference (GECCO) 2014: 13831390
We consider stochastic versions of OneMax and LeadingOnes and analyze the performance of evolutionary algorithms with and without populations on these problems. It is known that the (1+1) EA on OneMax performs well in the presence of very small noise, but poorly for higher noise levels. We extend these results to LeadingOnes and to many different noise models, showing how the application of drift theory can significantly simplify and generalize previous analyses. Most surprisingly, even small populations (of size \(\Theta(\log n)\)) can make evolutionary algorithms perform well for high noise levels, well outside the abilities of the (1+1) EA. Larger population sizes are even more beneficial; we consider both parent and offspring populations. In this sense, populations are robust in these stochastic settings.

Kötzing, Timo Concentration of first hitting times under additive drift. Genetic and Evolutionary Computation Conference (GECCO) 2014: 13911398
Recent advances in drift analysis have given us better and better tools for understanding random processes, including the run time of randomized search heuristics. In the setting of multiplicative drift we do not only have excellent bounds on the expected run time, but also more general results showing the concentration of the run time. In this paper we investigate the setting of additive drift under the assumption of strong concentration of the "step size" of the process. Under sufficiently strong drift towards the goal we show a strong concentration of the hitting time. In contrast to this, we show that in the presence of small drift a Gambler'sRuinlike behavior of the process overrides the influence of the drift. Finally, in the presence of sufficiently strong negative drift the hitting time is superpolynomial with high probability; this corresponds to the socalled negative drift theorem, for which we give new variants.

Bringmann, Karl; Friedrich, Tobias; Klitzke, Patrick Generic Postprocessing via Subset Selection for Hypervolume and EpsilonIndicator. Parallel Problem Solving from Nature (PPSN) 2014: 518527
Most biobjective evolutionary algorithms maintain a population of fixed size \(\mu\) and return the final population at termination. During the optimization process many solutions are considered, but most are discarded. We present two generic postprocessing algorithms which utilize the archive of all nondominated solutions evaluated during the search. We choose the best \(\mu\) solutions from the archive such that the hypervolume or \(\epsilon\)indicator is maximized. This postprocessing costs no additional fitness function evaluations and has negligible runtime compared to most EMOAs. We experimentally examine our postprocessing for four standard algorithms (NSGAII, SPEA2, SMSEMOA, IBEA) on ten standard test functions (DTLZ 12,7, ZDT 13, WFG 36) and measure the average quality improvement. The median decrease of the distance to the optimal \(\epsilon\)indicator is \(95\%\), the median decrease of the distance to the optimal hypervolume value is \(86\%\). We observe similar performance on a realworld problem (wind turbine placement).

Friedrich, Tobias; Neumann, Frank Maximizing Submodular Functions under Matroid Constraints by Multiobjective Evolutionary Algorithms. Parallel Problem Solving from Nature (PPSN) 2014: 922931
Nominated for Best Paper Award
Many combinatorial optimization problems have underlying goal functions that are submodular. The classical goal is to find a good solution for a given submodular function f under a given set of constraints. In this paper, we investigate the runtime of a multiobjective evolutionary algorithm called GSEMO until it has obtained a good approximation for submodular functions. For the case of monotone submodular functions and uniform cardinality constraints we show that GSEMO achieves a \((1  1/e)\)approximation in expected time \(O(n^2(\log n+k))\), where \(k\) is the value of the given constraint. For the case of nonmonotone submodular functions with \(k\) matroid intersection constraints, we show that GSEMO achieves a \((1/(k + 2 + 1/k + \epsilon)\)approximation in expected time \(O(n^{k+5\log(n)/\epsilon)\).

Sutton, Andrew M.; Neumann, Frank Runtime Analysis of Evolutionary Algorithms on Randomly Constructed HighDensity Satisfiable 3CNF Formulas. Parallel Problem Solving from Nature (PPSN) 2014: 942951
We show that simple mutationonly evolutionary algorithms find a satisfying assignment on two similar models of random planted 3CNF Boolean formulas in polynomial time with high probability in the high constraint density regime. We extend the analysis to random formulas conditioned on satisfiability (i.e., the socalled filtered distribution) and conclude that most highdensity satisfiable formulas are easy for simple evolutionary algorithms. With this paper, we contribute the first rigorous study of randomized search heuristics from the evolutionary computation community on wellstudied distributions of random satisfiability problems.

Göbel, Andreas; Goldberg, Leslie Ann; Richerby, David Counting Homomorphisms to Cactus Graphs Modulo 2. Symposium on Theoretical Aspects of Computer Science (STACS) 2014: 350361
A homomorphism from a graph \(G\) to a graph \(H\) is a function from \(V(G)\) to \(V(H)\) that preserves edges. Many combinatorial structures that arise in mathematics and in computer science can be represented naturally as graph homomorphisms and as weighted sums of graph homomorphisms. In this article, we study the complexity of counting homomorphisms modulo 2. The complexity of modular counting was introduced by Papadimitriou and Zachos and it has been pioneered by Valiant who famously introduced a problem for which counting modulo 7 is easy but counting modulo 2 is intractable. Modular counting provides a rich setting in which to study the structure of homomorphism problems. In this case, the structure of the graph \(H\) has a big influence on the complexity of the problem. Thus, our approach is graphtheoretic. We give a complete solution for the class of cactus graphs, which are connected graphs in which every edge belongs to at most one cycle. Cactus graphs arise in many applications such as the modelling of wireless sensor networks and the comparison of genomes. We show that, for some cactus graphs \(H\), counting homomorphisms to \(H\) modulo 2 can be done in polynomial time. For every other fixed cactus graph \(H\), the problem is complete in the complexity class \(\oplus P\), which is a wide complexity class to which every problem in the polynomial hierarchy can be reduced (using randomised reductions). Determining which \(H\) lead to tractable problems can be done in polynomial time. Our result builds upon the work of Faben and Jerrum, who gave a dichotomy for the case in which \(H\) is a tree.

Kötzing, Timo A Solution to Wiehagen's Thesis. Symposium on Theoretical Aspects of Computer Science (STACS) 2014: 494505
Wiehagen's Thesis in Inductive Inference (1991) essentially states that, for each learning criterion, learning can be done in a normalized, enumerative way. The thesis was not a formal statement and thus did not allow for a formal proof, but support was given by examples of a number of different learning criteria that can be learned by enumeration. Building on recent formalizations of learning criteria, we are now able to formalize Wiehagen's Thesis. We prove the thesis for a wide range of learning criteria, including many popular criteria from the literature. We also show the limitations of the thesis by giving four learning criteria for which the thesis does not hold (and, in two cases, was probably not meant to hold). Beyond the original formulation of the thesis, we also prove stronger versions which allow for many corollaries relating to strongly decisive and conservative learning.