Clean Citation Style 002
{ "authors" : [{ "lastname":"Bläsius" , "initial":"T" , "url":"https://hpi.de/friedrich/publications/people/thomas-blaesius.html" , "mail":"thomas.blasius(at)hpi.de" }, { "lastname":"Casel" , "initial":"K" , "url":"https://hpi.de/friedrich/publications/people/katrin-casel.html" , "mail":"katrin.casel(at)hpi.de" }, { "lastname":"Chauhan" , "initial":"A" , "url":"https://hpi.de/friedrich/publications/people/ankit-chauhan.html" , "mail":"ankit.chauhan(at)hpi.de" }, { "lastname":"Cohen" , "initial":"S" , "url":"https://hpi.de/friedrich/publications/people/sarel-cohen.html" , "mail":"sarel.cohen(at)hpi.de" }, { "lastname":"Cseh" , "initial":"" , "url":"https://hpi.de/friedrich/publications/people/agnes-cseh.html" , "mail":"agnes.cseh(at)hpi.de" }, { "lastname":"Doskoč" , "initial":"V" , "url":"https://hpi.de/friedrich/publications/people/vanja-doskoc.html" , "mail":"vanja.doskoc(at)hpi.de" }, { "lastname":"Elijazyfer" , "initial":"Z" , "url":"https://hpi.de/friedrich/people/ziena-elijazyfer.html" , "mail":"ziena.elijazyfer(at)hpi.de" }, { "lastname":"Fischbeck" , "initial":"P" , "url":"https://hpi.de/friedrich/publications/people/philipp-fischbeck.html" , "mail":"philipp.fischbeck(at)hpi.de" }, { "lastname":"Friedrich" , "initial":"T" , "url":"https://hpi.de/friedrich/publications/people/tobias-friedrich.html" , "mail":"friedrich(at)hpi.de" }, { "lastname":"Göbel" , "initial":"A" , "url":"https://hpi.de/friedrich/publications/people/andreas-goebel.html" , "mail":"andreas.goebel(at)hpi.de" }, { "lastname":"Issac" , "initial":"D" , "url":"https://hpi.de/friedrich/publications/people/davis-issac.html" , "mail":"davis.issac(at)hpi.de" }, { "lastname":"Katzmann" , "initial":"M" , "url":"https://hpi.de/friedrich/publications/people/maximilian-katzmann.html" , "mail":"maximilian.katzmann(at)hpi.de" }, { "lastname":"Khazraei" , "initial":"A" , "url":"https://hpi.de/friedrich/publications/people/ardalan-khazraei.html" , "mail":"ardalan.khazraei(at)hpi.de" }, { "lastname":"Kötzing" , "initial":"T" , "url":"https://hpi.de/friedrich/publications/people/timo-koetzing.html" , "mail":"timo.koetzing(at)hpi.de" }, { "lastname":"Krejca" , "initial":"M" , "url":"https://hpi.de/friedrich/publications/people/martin-krejca.html" , "mail":"martin.krejca(at)hpi.de" }, { "lastname":"Krogmann" , "initial":"S" , "url":"https://hpi.de/friedrich/publications/people/simon-krogmann.html" , "mail":"simon.krogmann(at)hpi.de" }, { "lastname":"Krohmer" , "initial":"A" , "url":"https://hpi.de/friedrich/publications/people/anton-krohmer.html" , "mail":"none" }, { "lastname":"Kumar" , "initial":"N" , "url":"https://hpi.de/friedrich/publications/people/nikhil-kumar.html" , "mail":"nikhil.kumar(at)hpi.de" }, { "lastname":"Lagodzinski" , "initial":"G" , "url":"https://hpi.de/friedrich/publications/people/gregor-lagodzinski.html" , "mail":"gregor.lagodzinski(at)hpi.de" }, { "lastname":"Lenzner" , "initial":"P" , "url":"https://hpi.de/friedrich/publications/people/pascal-lenzner.html" , "mail":"pascal.lenzner(at)hpi.de" }, { "lastname":"Melnichenko" , "initial":"A" , "url":"https://hpi.de/friedrich/publications/people/anna-melnichenko.html" , "mail":"anna.melnichenko(at)hpi.de" }, { "lastname":"Molitor" , "initial":"L" , "url":"https://hpi.de/friedrich/publications/people/louise-molitor.html" , "mail":"louise.molitor(at)hpi.de" }, { "lastname":"Neubert" , "initial":"S" , "url":"https://hpi.de/friedrich/publications/people/stefan-neubert.html" , "mail":"stefan.neubert(at)hpi.de" }, { "lastname":"Pappik" , "initial":"M" , "url":"https://hpi.de/friedrich/publications/people/marcus-pappik.html" , "mail":"marcus.pappik(at)hpi.de" }, { "lastname":"Quinzan" , "initial":"F" , "url":"https://hpi.de/friedrich/publications/people/francesco-quinzan.html" , "mail":"francesco.quinzan(at)hpi.de" }, { "lastname":"Rizzo" , "initial":"M" , "url":"https://hpi.de/friedrich/publications/people/manuel-rizzo.html" , "mail":"manuel.rizzo(at)hpi.de" }, { "lastname":"Rothenberger" , "initial":"R" , "url":"https://hpi.de/friedrich/publications/people/ralf-rothenberger.html" , "mail":"ralf.rothenberger(at)hpi.de" }, { "lastname":"Schirneck" , "initial":"M" , "url":"https://hpi.de/friedrich/publications/people/martin-schirneck.html" , "mail":"martin.schirneck(at)hpi.de" }, { "lastname":"Seidel" , "initial":"K" , "url":"https://hpi.de/friedrich/publications/people/karen-seidel.html" , "mail":"karen.seidel(at)hpi.de" }, { "lastname":"Sutton" , "initial":"A" , "url":"https://hpi.de/friedrich/publications/people/andrew-m-sutton.html" , "mail":"none" }, { "lastname":"Weyand" , "initial":"C" , "url":"https://hpi.de/friedrich/publications/people/christopher-weyand.html" , "mail":"none" }]}
Birnick, Johann; Bläsius, Thomas; Friedrich, Tobias; Naumann, Felix; Papenbrock, Thorsten; Schirneck, MartinHitting Set Enumeration with Partial Information for Unique Column Combination Discovery. Proceedings of the VLDB Endowment 2020: 2270 - 2283
Unique column combinations (UCCs) are a fundamental concept in relational databases. They identify entities in the data and support various data management activities. Still, UCCs are usually not explicitly defined and need to be discovered. State-of-the-art data profiling algorithms are able to efficiently discover UCCs in moderately sized datasets, but they tend to fail on large and, in particular, on wide datasets due to run time and memory limitations. In this paper, we introduce HPIValid, a novel UCC discovery algorithm that implements a faster and more resource-saving search strategy. HPIValid models the metadata discovery as a hitting set enumeration problem in hypergraphs. In this way, it combines efficient discovery techniques from data profiling research with the most recent theoretical insights into enumeration algorithms. Our evaluation shows that HPIValid is not only orders of magnitude faster than related work, it also has a much smaller memory footprint.
Bläsius, Thomas; Friedrich, Tobias; Schirneck, MartinThe Minimization of Random Hypergraphs. European Symposium on Algorithms (ESA) 2020: 21:1-21:15
We investigate the maximum-entropy model \(\mathcal{B}_{n,m,p}\) for random \(n\)-vertex, \(m\)-edge multi-hypergraphs with expected edge size \(pn\). We show that the expected size of the minimization \(\min(\mathcal{B}_{n,m,p})\), i.e., the number of inclusion-wise minimal edges of \(\mathcal{B}_{n,m,p}\), undergoes a phase transition with respect to \(m\). If \(m\) is at most \(1/(1-p)^(1-p)n}\), then \(\mathrm{E[|\min(\mathcal{B}_{n,m,p})|]\) is of order \(\Theta(m)\), while for \(m ge 1/(1-p)^(1-p+\varepsilon)n}\) for any \(\varepsilon > 0\), it is \(\Theta( 2^(\mathrm{H(\alpha) + (1-\alpha) \log_2 p) n/ \sqrt{n})\). Here, \(\mathrm{H}\) denotes the binary entropy function and \(alpha = - (\log_{1-p m)/n\). The result implies that the maximum expected number of minimal edges over all \(m\) is \(\Theta((1+p)^n/\sqrt{n})\). Our structural findings have algorithmic implications for minimizing an input hypergraph. This has applications in the profiling of relational databases as well as for the Orthogonal Vectors problem studied in fine-grained complexity. We make several technical contributions that are of independent interest in probability. First, we improve the Chernoff--Hoeffding theorem on the tail of the binomial distribution. In detail, we show that for a binomial variable \(Y sim \mathrm{Bin(n,p)\) and any \(0 < x < p\), it holds that \(\mathrm{P[Y le xn] = \Theta( 2^-\!\mathrm{D(x \,\|}\, p) n}/\sqrt{n})\), where \(\mathrm{D}\) is the binary Kullback--Leibler divergence between Bernoulli distributions. We give explicit upper and lower bounds on the constants hidden in the big-O notation that hold for all \(n\). Secondly, we establish the fact that the probability of a set of cardinality \(i\) being minimal after \(m\) i.i.d. maximum-entropy trials exhibits a sharp threshold behavior at \(i^* = n + \log_{1-p m\).
Shi, Feng; Schirneck, Martin; Friedrich, Tobias; Kötzing, Timo; Neumann, FrankCorrection to: Reoptimization Time Analysis of Evolutionary Algorithms on Linear Functions Under Dynamic Uniform Constraints. Algorithmica 2020: 3117-3123
In the article "Reoptimization Time Analysis of Evolutionary Algorithms on Linear Functions Under Dynamic Uniform Constraints", we claimed a worst-case runtime of \(O(nD \log D)\) and \(O(nD)\) for the Multi-Objective Evolutionary Algorithm and the Multi-Objective \((\mu+(\lambda, \lambda))\) Genetic Algorithm, respectively, on linear profit functions under dynamic uniform constraint, where \(D = |B − B^*|\) denotes the difference between the original constraint bound \(B\) and the new one \(B^*\) . The technique used to prove these results contained an error. We correct this mistake and show a weaker bound of \(O(nD^2)\) for both algorithms instead.
Friedrich, Tobias; Kötzing, Timo; Lagodzinski, J. A. Gregor; Neumann, Frank; Schirneck, MartinAnalysis of the (1+1) EA on Subclasses of Linear Functions under Uniform and Linear Constraints. Theoretical Computer Science 2020: 3-19
Linear functions have gained great attention in the run time analysis of evolutionary computation methods. The corresponding investigations have provided many effective tools for analyzing more complex problems. So far, the runtime analysis of evolutionary algorithms has mainly focused on unconstrained problems, but problems occurring in applications frequently involve constraints. Therefore, there is a strong need to extend the methods for analyzing unconstrained problems to a setting involving constraints. In this paper, we consider the behavior of the classical (1+1) evolutionary algorithm on linear functions under linear constraint. We show tight bounds in the case where the constraint is given by the OneMax function and the objective function is given by either the OneMax or the BinVal function. For the general case we present upper and lower bounds.
Bläsius, Thomas; Fischbeck, Philipp; Friedrich, Tobias; Schirneck, MartinUnderstanding the Effectiveness of Data Reduction in Public Transportation Networks. Workshop on Algorithms and Models for the Web Graph (WAW) 2019: 87-101
Given a public transportation network of stations and connections, we want to find a minimum subset of stations such that each connection runs through a selected station. Although this problem is NP-hard in general, real-world instances are regularly solved almost completely by a set of simple reduction rules. To explain this behavior, we view transportation networks as hitting set instances and identify two characteristic properties, locality and heterogeneity. We then devise a randomized model to generate hitting set instances with adjustable properties. While the heterogeneity does influence the effectiveness of the reduction rules, the generated instances show that locality is the significant factor. Beyond that, we prove that the effectiveness of the reduction rules is independent of the underlying graph structure. Finally, we show that high locality is also prevalent in instances from other domains, facilitating a fast computation of minimum hitting sets.
Bläsius, Thomas; Friedrich, Tobias; Lischeid, Julius; Meeks, Kitty; Schirneck, MartinEfficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling. Algorithm Engineering and Experiments (ALENEX) 2019: 130-143
We devise an enumeration method for inclusion-wise minimal hitting sets in hypergraphs. It has delay \(O(m^{k^\ast+1} \cdot n^2)\) and uses linear space. Hereby, \(n\) is the number of vertices, \(m\) the number of hyperedges, and \(k^\ast\) the rank of the transversal hypergraph. In particular, on classes of hypergraphs for which the cardinality \(k^\ast\) of the largest minimal hitting set is bounded, the delay is polynomial. The algorithm solves the extension problem for minimal hitting sets as a subroutine. We show that the extension problem is W[3]-complete when parameterised by the cardinality of the set which is to be extended. For the subroutine, we give an algorithm that is optimal under the exponential time hypothesis. Despite these lower bounds, we provide empirical evidence showing that the enumeration outperforms the theoretical worst-case guarantee on hypergraphs arising in the profiling of relational databases, namely, in the detection of unique column combinations.
Shi, Feng; Schirneck, Martin; Friedrich, Tobias; Kötzing, Timo; Neumann, FrankReoptimization Time Analysis of Evolutionary Algorithms on Linear Functions Under Dynamic Uniform Constraints. Algorithmica 2019: 828-857
Rigorous runtime analysis is a major approach towards understanding evolutionary computing techniques, and in this area linear pseudo-Boolean objective functions play a central role. Having an additional linear constraint is then equivalent to the NP-hard Knapsack problem, certain classes thereof have been studied in recent works. In this article, we present a dynamic model of optimizing linear functions under uniform constraints. Starting from an optimal solution with respect to a given constraint bound, we investigate the runtimes that different evolutionary algorithms need to recompute an optimal solution when the constraint bound changes by a certain amount. The classical \((1+1)\) EA and several population-based algorithms are designed for that purpose, and are shown to recompute efficiently. Furthermore, a variant of the \((1+(\lambda,\lambda))\) GA for the dynamic optimization problem is studied, whose performance is better when the change of the constraint bound is small.
Doerr, Benjamin; Fischbeck, Philipp; Frahnow, Clemens; Friedrich, Tobias; Kötzing, Timo; Schirneck, MartinIsland Models Meet Rumor Spreading. Algorithmica 2019: 886-915
Island models in evolutionary computation solve problems by a careful interplay of independently running evolutionary algorithms on the island and an exchange of good solutions between the islands. In this work, we conduct rigorous run time analyses for such island models trying to simultaneously obtain good run times and low communication effort. We improve the existing upper bounds for both measures (i) by improving the run time bounds via a careful analysis, (ii) by balancing individual computation and communication in a more appropriate manner, and (iii) by replacing the usual communicate-with-all approach with randomized rumor spreading. In the latter, each island contacts a randomly chosen neighbor. This epidemic communication paradigm is known to lead to very fast and robust information dissemination in many applications. Our results concern island models running simple (1+1) evolutionary algorithms to optimize the classic test functions OneMax and LeadingOnes. We investigate binary trees, d-dimensional tori, and complete graphs as communication topologies.
Kötzing, Timo; Schirneck, Martin; Seidel, KarenNormal Forms in Semantic Language Identification. Algorithmic Learning Theory (ALT) 2017: 493-516
We consider language learning in the limit from text where all learning restrictions are semantic, that is, where any conjecture may be replaced by a semantically equivalent conjecture. For different such learning criteria, starting with the well-known TxtGBc-learning, we consider three different normal forms: strongly locking learning, consistent learning and (partially) set-driven learning. These normal forms support and simplify proofs and give insight into what behaviors are necessary for successful learning (for example when consistency in conservative learning implies cautiousness and strong decisiveness). We show that strongly locking learning can be assumed for partially set-driven learners, even when learning restrictions apply. We give a very general proof relying only on a natural property of the learning restriction, namely, allowing for simulation on equivalent text. Furthermore, when no restrictions apply, also the converse is true: every strongly locking learner can be made partially set-driven. For several semantic learning criteria we show that learning can be done consistently. Finally, we deduce for which learning restrictions partial set-drivenness and set-drivenness coincide, including a general statement about classes of infinite languages. The latter again relies on a simulation argument.
Shi, Feng; Schirneck, Martin; Friedrich, Tobias; Kötzing, Timo; Neumann, FrankReoptimization Times of Evolutionary Algorithms on Linear Functions Under Dynamic Uniform Constraints. Genetic and Evolutionary Computation Conference (GECCO) 2017: 1407-1414
Thee investigations of linear pseudo-Boolean functions play a central role in the area of runtime analysis of evolutionary computing techniques. Having an additional linear constraint on a linear function is equivalent to the NP-hard knapsack problem and special problem classes thereof have been investigated in recent works. In this paper, we extend these studies to problems with dynamic constraints and investigate the runtime of different evolutionary algorithms to recompute an optimal solution when the constraint bound changes by a certain amount. We study the classical \((1+1)\) EA and population-based algorithms and show that they recompute an optimal solution very efficiently. Furthermore, we show that a variant of the \((1+(\lambda, \lambda))\) GA can recompute the optimal solution more efficiently in some cases.
Doerr, Benjamin; Fischbeck, Philipp; Frahnow, Clemens; Friedrich, Tobias; Kötzing, Timo; Schirneck, MartinIsland Models Meet Rumor Spreading. Genetic and Evolutionary Computation Conference (GECCO) 2017: 1359-1366
Island models in evolutionary computation solve problems by a careful interplay of independently running evolutionary algorithms on the island and an exchange of good solutions between the islands. In this work, we conduct rigorous run time analyses for such island models trying to simultaneously obtain good run times and low communication effort. We improve the existing upper bounds for the communication effort (i) by improving the run time bounds via a careful analysis, (ii) by setting the balance between individual computation and communication in a more appropriate manner, and (iii) by replacing the usual communicate-with-all-neighbors approach with randomized rumor spreading, where each island contacts a randomly chosen neighbor. This epidemic communication paradigm is known to lead to very fast and robust information dissemination in many applications. Our results concern islands running simple (1+1) evolutionary algorithms, we regard d-dimensional tori and complete graphs as communication topologies, and optimize the classic test functions OneMax and LeadingOnes.
Friedrich, Tobias; Kötzing, Timo; Lagodzinski, J. A. Gregor; Neumann, Frank; Schirneck, MartinAnalysis of the (1+1) EA on Subclasses of Linear Functions under Uniform and Linear Constraints. Foundations of Genetic Algorithms (FOGA) 2017: 45-54
Linear functions have gained a lot of attention in the area of run time analysis of evolutionary computation methods and the corresponding analyses have provided many effective tools for analyzing more complex problems. In this paper, we consider the behavior of the classical (1+1) Evolutionary Algorithm for linear functions under linear constraint. We show tight bounds in the case where both the objective function and the constraint is given by the OneMax function and present upper bounds as well as lower bounds for the general case. Furthermore, we also consider the LeadingOnes fitness function.
Bläsius, Thomas; Friedrich, Tobias; Schirneck, MartinThe Parameterized Complexity of Dependency Detection in Relational Databases. International Symposium on Parameterized and Exact Computation (IPEC) 2016: 6:1-6:13
We study the parameterized complexity of classical problems that arise in the profiling of relational data. Namely, we characterize the complexity of detecting unique column combinations (candidate keys), functional dependencies, and inclusion dependencies with the solution size as parameter. While the discovery of uniques and functional dependencies, respectively, turns out to be W[2]-complete, the detection of inclusion dependencies is one of the first natural problems proven to be complete for the class W[3]. As a side effect, our reductions give insights into the complexity of enumerating all minimal unique column combinations or functional dependencies.
Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Nallaperuma, Samadhi; Neumann, Frank; Schirneck, MartinFast Building Block Assembly by Majority Vote Crossover. Genetic and Evolutionary Computation Conference (GECCO) 2016: 661-668
Different works have shown how crossover can help with building block assembly. Typically, crossover might get lucky to select good building blocks from each parent, but these lucky choices are usually rare. In this work we consider a crossover operator which works on three parent individuals. In each component, the offspring inherits the value present in the majority of the parents; thus, we call this crossover operator majority vote. We show that, if good components are sufficiently prevalent in the individuals, majority vote creates an optimal individual with high probability. Furthermore, we show that this process can be amplified: as long as components are good independently and with probability at least \(1/2+\delta\), we require only \(O(\log 1/\delta + \log \log n)\) successive stages of majority vote to create an optimal individual with high probability! We show how this applies in two scenarios. The first scenario is the Jump test function. With sufficient diversity, we get an optimization time of \(O(n \log n)\) even for jump sizes as large as \(O(n^{(1/2-\epsilon)})\). Our second scenario is a family of vertex cover instances. Majority vote optimizes this family efficiently, while local searches fail and only highly specialized two-parent crossovers are successful.
Kötzing, Timo; Schirneck, MartinTowards an Atlas of Computational Learning Theory. Symposium on Theoretical Aspects of Computer Science (STACS) 2016: 47:1-47:13
A major part of our knowledge about Computational Learning stems from comparisons of the learning power of different learning criteria. These comparisons inform about trade-offs between learning restrictions and, more generally, learning settings; furthermore, they inform about what restrictions can be observed without losing learning power. With this paper we propose that one main focus of future research in Computational Learning should be on a structured approach to determine the relations of different learning criteria. In particular, we propose that, for small sets of learning criteria, all pairwise relations should be determined; these relations can then be easily depicted as a map, a diagram detailing the relations. Once we have maps for many relevant sets of learning criteria, the collection of these maps is an Atlas of Computational Learning Theory, informing at a glance about the landscape of computational learning just as a geographical atlas informs about the earth. In this paper we work toward this goal by providing three example maps, one pertaining to partially set-driven learning, and two pertaining to strongly monotone learning. These maps can serve as blueprints for future maps of similar base structure.