Issac, Davis; Bhattacharya, Anup; Kumar, Amit; Jaiswal, Ragesh Sampling in space restricted settingsAlgorithmica 2018: 1439–1458
Space efficient algorithms play an important role in dealing with large amount of data. In such settings, one would like to analyze the large data using small amount of “working space”. One of the key steps in many algorithms for analyzing large data is to maintain a (or a small number) random sample from the data points. In this paper, we consider two space restricted settings-(i) the streaming model, where data arrives over time and one can use only a small amount of storage, and (ii) the query model, where we can structure the data in low space and answer sampling queries. In this paper, we prove the following results in the above two settings: \begin{itemize item In the streaming setting, we would like to maintain a random sample from the elements seen so far. We prove that one can maintain a random sample using \( \mathcal{O(\log n) \) random bits and \( \mathcal{O(\log n) \) bits of space, where n is the number of elements seen so far. We can extend this to the case when elements have weights as well. item In the query model, there are n elements with weights \(w_1, \dots, w_n \) (which are w-bit integers) and one would like to sample a random element with probability proportional to its weight. Bringmann and Larsen (STOC 2013) showed how to sample such an element using \(n \omega + 1\) bits of space (whereas, the information theoretic lower bound is \(n \omega \)). We consider the approximate sampling problem, where we are given an error parameter \( \varepsilon \), and the sampling probability of an element can be off by an \( \varepsilon \) factor. We give matching upper and lower bounds for this problem. \end{itemize
Arulselvan, Ashwin; Cseh, Ágnes; Groß, Martin; Manlove, David F.; Matuschke, Jannik Matchings with Lower Quotas: Algorithms and ComplexityAlgorithmica 2018: 185–208
We study a natural generalization of the maximum weight many-to-one matching problem. We are given an undirected bipartite graph \(G = (A \cup P, E)\) with weights on the edges in E, and with lower and upper quotas on the vertices in P. We seek a maximum weight many-to-one matching satisfying two sets of constraints: vertices in A are incident to at most one matching edge, while vertices in \(P\) are either unmatched or they are incident to a number of matching edges between their lower and upper quota. This problem, which we call maximum weight many-to-one matching with lower and upper quotas (WMLQ), has applications to the assignment of students to projects within university courses, where there are constraints on the minimum and maximum numbers of students that must be assigned to each project. In this paper, we provide a comprehensive analysis of the complexity of WMLQ from the viewpoints of classical polynomial time algorithms, fixed-parameter tractability, as well as approximability. We draw the line between NP-hard and polynomially tractable instances in terms of degree and quota constraints and provide efficient algorithms to solve the tractable ones. We further show that the problem can be solved in polynomial time for instances with bounded treewidth; however, the corresponding runtime is exponential in the treewidth with the maximum upper quota \(u_{\max}\) as basis, and we prove that this dependence is necessary unless FPT=W[1]. The approximability of WMLQ is also discussed: we present an approximation algorithm for the general case with performance guarantee \(u_{\max}+1\), which is asymptotically best possible unless \(P=NP\). Finally, we elaborate on how most of our positive results carry over to matchings in arbitrary graphs with lower quotas.
Bläsius, Thomas; Karrer, Annette; Rutter, Ignaz Simultaneous Embedding: Edge Orderings, Relative Positions, CutverticesAlgorithmica 2018: 1214–1277
A simultaneous embedding (with fixed edges) of two graphs \(G^1\) and \(G^2\) with common graph \(G = G^1 ∩ G^2\) is a pair of planar drawings of \(G^1\) and \(G^2\) that coincide on \(G\). It is an open question whether there is a polynomial-time algorithm that decides whether two graphs admit a simultaneous embedding (problem SEFE). In this paper, we present two results. First, a set of three linear-time preprocessing algorithms that remove certain substructures from a given SEFE instance, producing a set of equivalent Sefe instances without such substructures. The structures we can remove are (1) cutvertices of the union graph \(G^∪ = G^1 ∪ G^2\) , (2) most separating pairs of \(G^∪\), and (3) connected components of G that are biconnected but not a cycle. Second, we give an \(O(n³)\)-time algorithm solving Sefe for instances with the following restriction. Let u be a pole of a P-node \(µ\) in the SPQR-tree of a block of \(G^1\) or \(G^2\). Then at most three virtual edges of \(µ\) may contain common edges incident to u. All algorithms extend to the sunflower case, i.e., to the case of more than two graphs pairwise intersecting in the same common graph.
Kötzing, Timo; Sudholt, Dirk Preface to the Special Issue on Theory of Genetic and Evolutionary ComputationAlgorithmica 2018: 1575–1578
Evolutionary algorithms (EAs) are randomized search heuristics that can be employed to solve complex optimization problems, including multimodal or highly constrained problems. EAs work by mimicking principles from natural evolution: maintaining a collection of possible solutions (a population) and iteratively creating variants of the individuals (the offspring) and then choosing a new set of individuals for the next iteration (selection). EAs are popular because they represent general-purpose optimizers that can be easily applied to various problems, even in cases where little or no in-depth knowledge about the problem is available. In order to guide practitioners devising new and effective algorithms, theoretical computer scientists employ methods from the field of randomized algorithms to analyze the working principles of EAs with mathematical rigor. Key questions concern the impact of parameter choices (such as, for example, the offspring size or the choice of variation operators) as well as foundational work on developing powerful analysis methods. The theory track of the annual ACM Genetic and Evolutionary Computation Conference (GECCO) is the first tier event for advances in this direction. In this special issue six selected papers from the 2016 edition of the GECCO theory track are collected, each one of them carefully revised and extended to meet the high quality standards of Algorithmica.
Doerr, Benjamin; Doerr, Carola; Kötzing, Timo Static and Self-Adjusting Mutation Strengths for Multi-valued Decision VariablesAlgorithmica 2018: 1732–1768
The most common representation in evolutionary computation are bit strings. With very little theoretical work existing on how to use evolutionary algorithms for decision variables taking more than two values, we study the run time of simple evolutionary algorithms on some OneMax-like functions defined over \(\Omega={0,1,\dots,r−1}^n\). We observe a crucial difference in how we extend the one-bit-flip and standard-bit mutation operators to the multi-valued domain. While it is natural to modify a random position of the string or select each position of the solution vector for modification independently with probability \(1/n\), there are various ways to then change such a position. If we change each selected position to a random value different from the original one, we obtain an expected run time of \(\Theta(n r \log n)\). If we change each selected position by \(+1\) or \(−1\) (random choice), the optimization time reduces to \(\Theta(n r + n \log n)\). If we use a random mutation strength \(i \in {0,1,\dots,r−1}\) with probability inversely proportional to \(i\) and change the selected position by \(+i\) or \(−i\) (random choice), then the optimization time becomes \(\Theta(n \log(r)(\log n + \log r))\), which is asymptotically faster than the previous if \(r = \omega(\log(n)\log\log(n))\). Interestingly, a better expected performance can be achieved with a self-adjusting mutation strength that is based on the success of previous iterations. For the mutation operator that modifies a randomly chosen position, we show that the self-adjusting mutation strength yields an expected optimization time of \(\Theta(n(\log n + \log r))\), which is best possible among all dynamic mutation strengths. In our proofs, we use a new multiplicative drift theorem for computing lower bounds, which is not restricted to processes that move only towards the target.
Bläsius, Thomas; Friedrich, Tobias; Krohmer, Anton Cliques in Hyperbolic Random GraphsAlgorithmica 2018: 2324–2344
Most complex real world networks display scale-free features. This characteristic motivated the study of numerous random graph models with a power-law degree distribution. There is, however, no established and simple model which also has a high clustering of vertices as typically observed in real data. Hyperbolic random graphs bridge this gap. This natural model has recently been introduced by Krioukov et al. and has shown theoretically and empirically to fulfill all typical properties of real world networks, including power-law degree distribution and high clustering. We study cliques in hyperbolic random graphs \(G\) and present new results on the expected number of \(k\)-cliques \(E[K_k]\) and the size of the largest clique \(\omega(G)\). We observe that there is a phase transition at power-law exponent \(\beta = 3\). More precisely, for \(\beta\)\(\in\)\((2,3)\) we prove \(E[K_k] = \) \(n^{k(3-\beta)/2} \Theta(k)^{-k}\) and \(\omega(G) = \) \(\Theta\)\((n^{(3-\beta)/2})\), while for \(\beta \geq 3\) we prove \(E[K_k]=n \Theta(k)^{-k}\) and \(\omega(G)=\Theta(\log(n)/ \log\log n)\). Furthermore, we show that for \(\beta \geq 3\), cliques in hyperbolic random graphs can be computed in time \(O(n)\). If the underlying geometry is known, cliques can be found with worst-case runtime \(O(m n^{2.5})\) for all values of \(\beta\).
Abu-Khzam, Faisal N.; Bazgan, Cristina; Casel, Katrin; Fernau, Henning Clustering with Lower-Bounded Sizes - A General Graph-Theoretic FrameworkAlgorithmica 2018: 2517–2550
Classical clustering problems search for a partition of objects into a fixed number of clusters. In many scenarios, however, the number of clusters is not known or necessarily fixed. Further, clusters are sometimes only considered to be of significance if they have a certain size. We discuss clustering into sets of minimum cardinality \(k\) without a fixed number of sets and present a general model for these types of problems. This general framework allows the comparison of different measures to assess the quality of a clustering. We specifically consider nine quality-measures and classify the complexity of the resulting problems with respect to \(k\). Further, we derive some polynomial-time solvable cases for \(k=2\) with connections to matching-type problems which, among other graph problems, then are used to compute approximations for larger values of \(k\).
Bringmann, Karl; Friedrich, Tobias; Krohmer, Anton De-anonymization of Heterogeneous Random Graphs in Quasilinear TimeAlgorithmica 2018: 3397–3427
There are hundreds of online social networks with altogether billions of users. Many such networks publicly release structural information, with all personal information removed. Empirical studies have shown, however, that this provides a false sense of privacy - it is possible to identify almost all users that appear in two such anonymized network as long as a few initial mappings are known. We analyze this problem theoretically by reconciling two versions of an artificial power-law network arising from independent subsampling of vertices and edges. We present a new algorithm that identifies most vertices and makes no wrong identifications with high probability. The number of vertices matched is shown to be asymptotically optimal. For an n-vertex graph, our algorithm uses \(n^\varepsilon\) seed nodes (for an arbitrarily small \(\varepsilon\)) and runs in quasilinear time. This improves previous theoretical results which need \(\Theta(n)\) seed nodes and have runtimes of order \(n^{1+\Omega(1)}\). Additionally, the applicability of our algorithm is studied experimentally on different networks.
Bläsius, Thomas; Stumpf, Peter; Ueckerdt, Torsten Local and Union BoxicityDiscrete Mathematics 2018: 1307–1315
The boxicity \(box(H)\) of a graph \(H\) is the smallest integer \(d\) such that \(H\) is the intersection of \(d\) interval graphs, or equivalently, that \(H\) is the intersection graph of axis-aligned boxes in \(R^d\). These intersection representations can be interpreted as covering representations of the complement \(H^c\) of \(H\) with co-interval graphs, that is, complements of interval graphs. We follow the recent framework of global, local and folded covering numbers (Knauer and Ueckerdt, 2016) to define two new parameters: the local boxicity \(box_l(H)\) and the union boxicity \(box_u(H)\) of \(H\). The union boxicity of \(H\) is the smallest \(d\) such that \(H^c\) can be covered with \(d\) vertex-disjoint unions of co-interval graphs, while the local boxicity of \(H\) is the smallest \(d\) such that \(H^c\) can be covered with co-interval graphs, at most \(d\) at every vertex. We show that for every graph \(H\) we have \(box_l(H) leq box_u(H) leq box(H) \) and that each of these inequalities can be arbitrarily far apart. Moreover, we show that local and union boxicity are also characterized by intersection representations of appropriate axis-aligned boxes in \(R^d\) . We demonstrate with a few striking examples, that in a sense, the local boxicity is a better indication for the complexity of a graph, than the classical boxicity.
Akbari, Saieed; Maimani, Hamid Reza; Majd, Leila Parsaei On the spectrum of some signed complete and complete bipartite graphsFilomat 2018: 5817–5826
In this paper, we obtain the spectrum of signed complete and complete bipartite graphs whose negative edges form a matching. Moreover, we construct a family of signed complete graphs having symmetric spectrum.
Dang, Duc-Cuong; Friedrich, Tobias; Kötzing, Timo; Krejca, Martin S.; Lehre, Per Kristian; Oliveto, Pietro S.; Sudholt, Dirk; Sutton, Andrew M. Escaping Local Optima Using Crossover with Emergent DiversityIEEE Transactions on Evolutionary Computation 2018: 484–497
Population diversity is essential for the effective use of any crossover operator. We compare seven commonly used diversity mechanisms and prove rigorous run time bounds for the \((\mu+1)\) GA using uniform crossover on the fitness function \(Jump_k\). All previous results in this context only hold for unrealistically low crossover probability \(p_c=O(k/n)\), while we give analyses for the setting of constant \(p_c < 1\) in all but one case. Our bounds show a dependence on the problem size \(n\), the jump length \(k\), the population size \(\mu\), and the crossover probability \(p_c\). For the typical case of constant \(k > 2\) and constant \(p_c\), we can compare the resulting expected optimisation times for different diversity mechanisms assuming an optimal choice of \(\mu\): \(O(n^{k-1})\) for duplicate elimination/minimisation, \(O(n^2 \log n)\) for maximising the convex hull, \(O(n \log n)\) for det. crowding (assuming \(p_c = k/n\)), \(O(n \log n)\) for maximising the Hamming distance, \(O(n \log n)\) for fitness sharing, \(O(n \log n)\) for the single-receiver island model. This proves a sizeable advantage of all variants of the \((\mu+1)\) GA compared to the (1+1) EA, which requires \(\Theta(n^k)\). In a short empirical study we confirm that the asymptotic differences can also be observed experimentally.
Bläsius, Thomas; Friedrich, Tobias; Krohmer, Anton; Laue, Sören Efficient Embedding of Scale-Free Graphs in the Hyperbolic PlaneIEEE/ACM Transactions on Networking 2018: 920–933
Hyperbolic geometry appears to be intrinsic in many large real networks. We construct and implement a new maximum likelihood estimation algorithm that embeds scale-free graphs in the hyperbolic space. All previous approaches of similar embedding algorithms require at least a quadratic runtime. Our algorithm achieves quasilinear runtime, which makes it the first algorithm that can embed networks with hundreds of thousands of nodes in less than one hour. We demonstrate the performance of our algorithm on artificial and real networks. In all typical metrics, like Log-likelihood and greedy routing, our algorithm discovers embeddings that are very close to the ground truth.
Baum, Moritz; Bläsius, Thomas; Gemsa, Andreas; Rutter, Ignaz; Wegner, Franziska Scalable Exact Visualization of Isocontours in Road Networks via Minimum-Link PathsJournal of Computational Geometry 2018: 27–73
Isocontours in road networks represent the area that is reachable from a source within a given resource limit. We study the problem of computing accurate isocontours in realistic, large-scale networks. We propose isocontours represented by polygons with minimum number of segments that separate reachable and unreachable components of the network. Since the resulting problem is not known to be solvable in polynomial time, we introduce several heuristics that run in (almost) linear time and are simple enough to be implemented in practice. A key ingredient is a new practical linear-time algorithm for minimum-link paths in simple polygons. Experiments in a challenging realistic setting show excellent performance of our algorithms in practice, computing near-optimal solutions in a few milliseconds on average, even for long ranges.
Rizzo, Manuel; Battaglia, Francesco Statistical and Computational Tradeoff in Genetic Algorithm-Based EstimationJournal of Statistical Computation and Simulation 2018: 3081–3097
When a genetic algorithm (GA) is employed in a statistical problem, the result is affected by both variability due to sampling and the stochastic elements of algorithm. Both of these components should be controlled in order to obtain reliable results. In the present work we analyze parametric estimation problems tackled by GAs, and pursue two objectives: the first one is related to a formal variability analysis of final estimates, showing that it can be easily decomposed in the two sources of variability. In the second one we introduce a framework of GA estimation with fixed computational resources, which is a form of statistical and the computational tradeoff question, crucial in recent problems. In this situation the result should be optimal from both the statistical and computational point of view, considering the two sources of variability and the constraints on resources. Simulation studies will be presented for illustrating the proposed method and the statistical and computational tradeoff question.
Akbari, Saieed; Haemers, Willem H; Maimani, Hamid Reza; Majd, Leila Parsaei Signed graphs cospectral with the pathLinear Algebra and its Applications 2018: 104–116
A signed graph \(\Gamma\) is said to be determined by its spectrum if every signed graph with the same spectrum as \(\Gamma\) is switching isomorphic with \(\Gamma\). Here it is proved that the path \(P_n\), interpreted as a signed graph, is determined by its spectrum if and only if \( n equiv 0, 1, or 2 (\text{mod 4)\), unless \(n \in \8, 13, 14, 17, 29\ \), or \(n = 3\).
Cseh, Ágnes; Kavitha, Telikepalli Popular edges and dominant matchingsMathematical Programming 2018: 209–229
Given a bipartite graph \(G=(A \cup B, E)\) with strict preference lists and given an edge \(e^* \in E\), we ask if there exists a popular matching in \(G\) that contains \(e^*\). We call this the popular edge problem. A matching \(M\) is popular if there is no matching \(M'\) such that the vertices that prefer \(M'\) to \(M\) outnumber those that prefer \(M\) to \(M'\). It is known that every stable matching is popular; however \(G\) may have no stable matching with the edge \(e^*\). In this paper we identify another natural subclass of popular matchings called "dominant matchings" and show that if there is a popular matching that contains the edge \(e^*\), then there is either a stable matching that contains \(e^*\) or a dominant matching that contains \(e^*\). This allows us to design a linear time algorithm for identifying the set of popular edges. When preference lists are complete, we show an \(\mathcal{O(n^3)\) algorithm to find a popular matching containing a given set of edges or report that none exists, where \(n=|A|+|B|\).
Azar, Yossi; Cohen, Sarel An improved algorithm for online machine minimizationOperations Research Letters 2018: 128–133
The online machine minimization problem seeks to design a preemptive scheduling algorithm on multiple machines — each job \(j\) arrives at its release time \(r_j\), has to be processed for \(p_j\) time units, and must be completed by its deadline \(d_j\). The goal is to minimize the number of machines the algorithm uses. We improve the \(\mathcal{O(\log m)\)-competitive algorithm by Chen, Megow and Schewior (SODA 2016) and provide an \(\mathcal{O(\frac{\log m\log \log m})\)-competitive algorithm.
Friedrich, Tobias; Krohmer, Anton On the diameter of hyperbolic random graphsSIAM Journal on Discrete Mathematics 2018: 1314–1334
Large real-world networks are typically scale-free. Recent research has shown that such graphs are described best in a geometric space. More precisely, the internet can be mapped to a hyperbolic space such that geometric greedy routing is close to optimal (Boguñá, Papadopoulos, and Krioukov. Nature Communications, 1:62, 2010). This observation has pushed the interest in hyperbolic networks as a natural model for scale-free networks. Hyperbolic random graphs follow a power law degree distribution with controllable exponent \(\beta\) and show high clustering (Gugelmann, Panagiotou, and Peter. ICALP, pp. 573–585, 2012). For understanding the structure of the resulting graphs and for analyzing the behavior of network algorithms, the next question is bounding the size of the diameter. The only known explicit bound is \(O(\)\((\log n)\)\(^{32/((3 - \beta)(5 - \beta))+1})\)(Kiwi and Mitsche. ANALCO, pp. 26–39, 2015). We present two much simpler proofs for an improved upper bound of \(O((\log n)\)\(^{2/(3 - \beta)})\) and a lower bound of \(\Omega(\log n)\). If \(\beta > 3\), we show that the latter bound is tight by proving an upper bound of \(O(\log n)\) for the diameter.
Friedrich, Tobias; Katzmann, Maximilian; Krohmer, Anton Unbounded Discrepancy of Deterministic Random Walks on GridsSIAM Journal on Discrete Mathematics 2018: 2441–2452
Random walks are frequently used in randomized algorithms. We study a derandomized variant of a random walk on graphs, called rotor-router model. In this model, instead of distributing tokens randomly, each vertex serves its neighbors in a fixed deterministic order. For most setups, both processes behave remarkably similar: Starting with the same initial configuration, the number of tokens in the rotor-router model deviates only slightly from the expected number of tokens on the corresponding vertex in the random walk model. The maximal difference over all vertices and all times is called single vertex discrepancy. Cooper and Spencer (2006) showed that on \(\mathbb{Z}^{d}\) the single vertex discrepancy is only a constant \(c_d\). Other authors also determined the precise value of \(c_d\) for \(d=1,2\). All these results, however, assume that initially all tokens are only placed on one partition of the bipartite graph \(\mathbb{Z}^{d}\). We show that this assumption is crucial by proving that otherwise the single vertex discrepancy can become arbitrarily large. For all dimensions \(d\geq1\) and arbitrary discrepancies~\(\ell \geq 0\), we construct configurations that reach a discrepancy of at least \(\ell\).
Bazgan, Cristina; Brankovic, Ljiljana; Casel, Katrin; Fernau, Henning; Jansen, Klaus; Klein, Kim-Manuel; Lampis, Michael; Liedloff, Mathieu; Monnot, Jérôme; Paschos, Vangelis Th. The many facets of upper dominationTheoretical Computer Science 2018: 2–25
This paper studies Upper Domination, i.e., the problem of computing the maximum cardinality of a minimal dominating set in a graph with respect to classical and parameterised complexity as well as approximability.
Doskoč, Vanja Confident Iterative Learning in Computational Learning TheoryCurrent Trends in Theory and Practice of Computer Science (SOFSEM) 2018: 30–42
In inductive inference various types of learning have emerged. The main aim of this paper is to investigate a new type of learning, the confident iterative learning. Given a class to be learnt, the idea here is to merge the following two concepts. For confidence, we require the learner to converge on any set, however, it only needs to be correct on the sets in the class. To be iterative, we restrict the learner’s memory on previ- ous inputs and calculations to its last hypothesis. Investigating the new learner, we will provide negative and positive examples, as well as some properties the confident iterative learner possesses. This will peak at a classification theorem for certain types of classes. Next, we will introduce and compare different types of confidence, focus- ing on the learner’s behaviour on sets outside of the class. Lastly, we will focus on the possible hypotheses. Introducing learning with respect to hypothesis spaces, we will provide examples witnessing that exact, class preserving and class comprising learning are different.
Battaglia, Francesco; Cucina, Domenico; Rizzo, Manuel Periodic Autoregressive Models with Multiple Structural Changes by Genetic AlgorithmsMathematical and Statistical Methods for Actuarial Sciences and Finance (MAF) 2018: 107–110
We present a model and a computational procedure for dealing with seasonality and regime changes in time series. In this work we are interested in time series which in addition to trend display seasonality in mean, in autocorrelation and in variance. These type of series appears in many areas, including hydrology, meteorology, economics and finance. The seasonality is accounted for by subset PAR modelling, for which each season follows a possibly different Autoregressive model. Levels, trend, autoregressive parameters and residual variances are allowed to change their values at fixed unknown times. The identification of number and location of structural changes, as well as PAR lags indicators, is based on Genetic Algorithms, which are suitable because of high dimensionality of the discrete search space. An application to Italian industrial production index time series is also proposed.
Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Krohmer, Anton Hyperbolic Embeddings for Near-Optimal Greedy RoutingAlgorithm Engineering and Experiments (ALENEX) 2018: 199–208
Greedy routing computes paths between nodes in a network by successively moving to the neighbor closest to the target with respect to coordinates given by an embedding into some metric space. Its advantage is that only local information is used for routing decisions. We present different algorithms for generating graph embeddings into the hyperbolic plane that are well suited for greedy routing. In particular our embeddings guarantee that greedy routing always succeeds in reaching the target and we try to minimize the lengths of the resulting greedy paths. We evaluate our algorithm on multiple generated and real wold networks. For networks that are generally assumed to have a hidden underlying hyperbolic geometry, such as the Internet graph, we achieve near-optimal results, i.e., the resulting greedy paths are only slightly longer than the corresponding shortest paths. In the case of the Internet graph, they are only \(6\%\) longer when using our best algorithm, which greatly improves upon the previous best known embedding, whose creation required substantial manual intervention.
Kumar Shekar, Arvind; Pappik, Marcus; Iglesias Sánchez, Patricia; Müller, Emmanuel Selection of Relevant and Non-Redundant Multivariate Ordinal Patterns for Time Series ClassificationDiscovery Science (DS) 2018: 224–240
Transformation of multivariate time series into feature spaces are common for data mining tasks like classification. Ordinality is one important property in time series that provides a qualitative representation of the underlying dynamic regime. In a multivariate time series, ordinalities from multiple dimensions combine together to be discriminative for the classification problem. However, existing works on ordinality do not address the multivariate nature of the time series. For multivariate ordinal patterns, there is a computational challenge with an explosion of pattern combinations, while not all patterns are relevant and provide novel information for the classification. In this work, we propose a technique for the extraction and selection of relevant and non-redundant multivariate ordinal patterns from the high-dimensional combinatorial search space. Our proposed approach Ordinal feature extraction (ordex), simultaneously extracts and scores the relevance and redundancy of ordinal patterns without training a classifier. As a filter-based approach, ordex aims to select a set of relevant patterns with complementary information. Hence, using our scoring function based on the principles of Chebyshev’s inequality, we maximize the relevance of the patterns and minimize the correlation between them. Our experiments on real world datasets show that ordinality in time series contains valuable information for classification in several applications.
Cseh, Ágnes; Kavitha, Telikepalli Popular Matchings in Complete GraphsFoundations of Software Technology and Theoretical Computer Science (FSTTCS) 2018: 17:1–17:14
Our input is a complete graph \(G = (V,E)\) on n vertices where each vertex has a strict ranking of all other vertices in \(G\). The goal is to construct a matching in \(G\) that is "globally stable" or popular. A matching \(M\) is popular if \(M\) does not lose a head-to-head election against any matching \(M'\): here each vertex casts a vote for the matching in \(\{M,M'\}\) where it gets a better assignment. Popular matchings need not exist in the given instance \(G\) and the popular matching problem is to decide whether one exists or not. The popular matching problem in \(G\) is easy to solve for odd \(n\). Surprisingly, the problem becomes NP-hard for even n, as we show here.
Friedrich, Tobias; Quinzan, Francesco; Wagner, Markus Escaping Large Deceptive Basins of Attraction with Heavy Mutation OperatorsGenetic and Evolutionary Computation Conference (GECCO) 2018: 293–300
In many Evolutionary Algorithms (EAs), a parameter that needs to be tuned is that of the mutation rate, which determines the probability for each decision variable to be mutated. Typically, this rate is set to 1/n for the duration of the optimization, where n is the number of decision variables. This setting has the appeal that the expected number of mutated variables per iteration is one. In a recent theoretical study, it was proposed to sample the number of mutated variables from a power-law distribution. This results into a significantly higher probability on larger numbers of mutations, so that escaping local optima becomes more probable. In this paper, we propose another class of non-uniform mutation rates. We study the benefits of this operator in terms of average-case black-box complexity analysis and experimental comparison. We consider both pseudo-Boolean artificial landscapes and combinatorial problems (the Minimum Vertex Cover and the Maximum Cut). We observe that our non-uniform mutation rates significantly outperform the standard choices, when dealing with landscapes that exhibit large deceptive basins of attraction.
Friedrich, Tobias; Kötzing, Timo; Quinzan, Francesco; Sutton, Andrew M. Improving the Run Time of the (1+1) Evolutionary Algorithm with Luby SequencesGenetic and Evolutionary Computation Conference (GECCO) 2018: 301–308
In the context of black box optimization, the most common way to handle deceptive attractors is to periodically restart the algorithm. In this paper, we explore the benefits of combining the simple \((1+1)\) Evolutionary Algorithm (EA) with the Luby Universal Strategy - the \((1+1)~EA_{\mathcal{U}}\), a meta-heuristic that does not require parameter tuning. We first consider two artificial pseudo-Boolean landscapes, on which the \((1+1)~EA\) exhibits exponential run time. We prove that the \((1+1)~EA_{\mathcal{U}}\) has polynomial run time on both instances. We then consider the Minimum Vertex Cover on two classes of graphs. Again, the \((1+1)~EA\) yields exponential run time on those instances, and the \((1+1)~EA_{\mathcal{U}}\) finds the global optimum in polynomial time. We conclude by studying the Makespan Scheduling. We consider an instance on which the \((1+1)~EA\) does not find a \((4/3-\epsilon)\)-approximation in polynomial time, and we show that the \((1+1)~EA_{\mathcal{U}}\) reaches a \((4/3-\epsilon)\)-approximation in polynomial time. We then prove that the \((1+1)~EA_{\mathcal{U}}\) serves as an Efficient Polynomial-time Approximation Scheme (EPTAS) for the Partition Problem, for a \((1+\epsilon)\)-approximation with \(\epsilon > 4/n\).
Gao, Wanru; Friedrich, Tobias; Neumann, Frank; Hercher, Christian Randomized Greedy Algorithms for Covering ProblemsGenetic and Evolutionary Computation Conference (GECCO) 2018: 309–315
Greedy algorithms provide a fast and often also effective solution to many combinatorial optimization problems. However, it is well known that they sometimes lead to low quality solutions on certain instances. In this paper, we explore the use of randomness in greedy algorithms for the minimum vertex cover and dominating set problem and compare the resulting performance against their deterministic counterpart. Our algorithms are based on a parameter \(\gamma\) which allows to explore the spectrum between uniform and deterministic greedy selection in the steps of the algorithm and our theoretical and experimental investigations point out the benefits of incorporating randomness into greedy algorithms for the two considered combinatorial optimization problems.
Doerr, Benjamin; Krejca, Martin S. Significance-based Estimation-of-Distribution AlgorithmsGenetic and Evolutionary Computation Conference (GECCO) 2018: 1483–1490
Estimation-of-distribution algorithms (EDAs) are randomized search heuristics that maintain a stochastic model of the solution space. This model is updated from iteration to iteration based on the quality of the solutions sampled according to the model. As previous works show, this short-term perspective can lead to erratic updates of the model, in particular, to bit-frequencies approaching a random boundary value. This can lead to significant performance losses. In order to overcome this problem, we propose a new EDA that takes into account a longer history of samples and updates its model only with respect to information which it classifies as statistically significant. We prove that this significance-based compact genetic algorithm (sig-cGA) optimizes the common benchmark functions OneMax and LeadingOnes both in \(O(n\log n)\) time, a result shown for no other EDA or evolutionary algorithm so far. For the recently proposed scGA – an EDA that tries to prevent erratic model updates by imposing a bias to the uniformly distributed model – we prove that it optimizes OneMax only in a time exponential in the hypothetical population size \(1/\rho\).
Issac, Davis; Chandran, L. Sunil; Cheung, Yuen Kueng Spanning tree congestion and computation of gyori lovasz partitionInternational Colloquium on Automata, Languages, and Programming (ICALP) 2018: 1–14
We study a natural problem in graph sparsification, the Spanning Tree Congestion (STC) problem. Informally, it seeks a spanning tree with no tree-edge routing too many of the original edges. For any general connected graph with n vertices and m edges, we show that its STC is at most \( \mathcal{O(\sqrt{mn}) \), which is asymptotically optimal since we also demonstrate graphs with STC at least \( \Omega(\sqrt{mn}) \). We present a polynomial-time algorithm which computes a spanning tree with congestion \( \mathcal{O(\sqrt{mn}) \log n \) \( \mathcal{O(\sqrt{mn} \log n ) \). We also present another algorithm for computing a spanning tree with congestion \( \mathcal{O(\sqrt{mn}) \); this algorithm runs in sub-exponential time when \(m = \omega(n \log^2 n ) \). For achieving the above results, an important intermediate theorem is generalized Györi-Lovász theorem. Chen et al. [Jiangzhuo Chen et al., 2007] gave a non-constructive proof. We give the first elementary and constructive proof with a local search algorithm of running time \( \mathcal{O^*( 4^n ) \). We discuss some consequences of the theorem concerning graph partitioning, which might be of independent interest. We also show that for any graph which satisfies certain expanding properties, its STC is at most \( \mathcal{O^*(n) \), and a corresponding spanning tree can be computed in polynomial time. We then use this to show that a random graph has STC \( \Theta(n) \) with high probability.
Arar, Moab; Chechik, Shiri; Cohen, Sarel; Stein, Cliff; Wajc, David Dynamic Matching: Reducing Integral Algorithms to Approximately-Maximal Fractional AlgorithmsInternational Colloquium on Automata, Languages and Programming (ICALP) 2018: 7:1–7:16
We present a simple randomized reduction from fully-dynamic integral matching algorithms to fully-dynamic "approximately-maximal" fractional matching algorithms. Applying this reduction to the recent fractional matching algorithm of Bhattacharya, Henzinger, and Nanongkai (SODA 2017), we obtain a novel result for the integral problem. Specifically, our main result is a randomized fully-dynamic \((2+\varepsilon)\)-approximate integral matching algorithm with small polylog worst-case update time. For the \((2+\varepsilon)\)-approximation regime only a fractional fully-dynamic \((2+\varepsilon)\)-matching algorithm with worst-case polylog update time was previously known, due to Bhattacharya et al. (SODA 2017). Our algorithm is the first algorithm that maintains approximate matchings with worst-case update time better than polynomial, for any constant approximation ratio. As a consequence, we also obtain the first constant-approximate worst-case polylogarithmic update time maximum weight matching algorithm.
Bläsius, Thomas; Freiberger, Cedric; Friedrich, Tobias; Katzmann, Maximilian; Montenegro-Retana, Felix; Thieffry, Marianne Efficient Shortest Paths in Scale-Free Networks with Underlying Hyperbolic GeometryInternational Colloquium on Automata, Languages, and Programming (ICALP) 2018: 20:1–20:14
A common way to accelerate shortest path algorithms on graphs is the use of a bidirectional search, which simultaneously explores the graph from the start and the destination. It has been observed recently that this strategy performs particularly well on scale-free real-world networks. Such networks typically have a heterogeneous degree distribution (e.g., a power-law distribution) and high clustering (i.e., vertices with a common neighbor are likely to be connected themselves). These two properties can be obtained by assuming an underlying hyperbolic geometry. To explain the observed behavior of the bidirectional search, we analyze its running time on hyperbolic random graphs and prove that it is \(\tilde{O}(n\)\(^{2 - 1/ \alpha}+\)\(n^{1/(2\alpha)}\)\(+ \delta_{\max})\) with high probability, where \(\alpha\)\(\in\)\((0.5, 1)\) controls the power-law exponent of the degree distribution, and \(\delta_{\max}\) is the maximum degree. This bound is sublinear, improving the obvious worst-case linear bound. Although our analysis depends on the underlying geometry, the algorithm itself is oblivious to it.
Casel, Katrin Resolving Conflicts for Lower-Bounded ClusteringInternational Symposium on Parameterized and Exact Computation (IPEC) 2018: 23:1–23:14
This paper considers the effect of non-metric distances for lower-bounded clustering, i.e., the problem of computing a partition for a given set of objects with pairwise distance, such that each set has a certain minimum cardinality (as required for anonymisation or balanced facility location problems). We discuss lower-bounded clustering with the objective to minimise the maximum radius or diameter of the clusters. For these problems there exists a 2-approximation but only if the pairwise distance on the objects satisfies the triangle inequality, without this property no polynomial-time constant factor approximation is possible. We try to resolve or at least soften this effect of non-metric distances by devising particular strategies to deal with violations of the triangle inequality ("conflicts"). With parameterised algorithmics, we find that if the number of such conflicts is not too large, constant factor approximations can still be computed efficiently. In particular, we introduce parameterised approximations with respect to not just the number of conflicts but also for the vertex cover number of the "conflict graph" (graph induced by conflicts). Interestingly, we salvage the approximation ratio of 2 for diameter while for radius it is only possible to show a ratio of 3. For the parameter vertex cover number of the conflict graph this worsening in ratio is shown to be unavoidable, unless FPT=W[2]. We further discuss improvements for diameter by choosing the (induced) \(P_3\)-cover number of the conflict graph as parameter and complement these by showing that, unless FPT=W[1], there exists no constant factor parameterised approximation with respect to the parameter split vertex deletion set.
Cucina, Domenico; Rizzo, Manuel; Ursu, Eugen Identification of Multiregime Periodic Autoregressive Models by Genetic AlgorithmsInternational Conference on Time Series and Forecasting (ITISE) 2018: 396–407
This paper develops a procedure for identifying multiregime Periodic AutoRegressive (PAR) models. In each regime a possibly dif- ferent PAR model is built, for which changes can be due to the seasonal means, the autocorrelation structure or the variances. Number and lo- cations of changepoints which subdivide the time span are detected by means of Genetic Algorithms (GAs), that optimize an identification cri- terion. The method is evaluated by means of simulation studies, and is then employed to analyze shrimp fishery data.
Issac, Davis; van Leeuwen, Erik Jan; Das, Anita; Chandran, L. Sunil Algorithms and bounds for very strong rainbow coloringLatin American Symposium on Theoretical Informatics Conference (LATIN) 2018: 625–639
A well-studied coloring problem is to assign colors to the edges of a graph G so that, for every pair of vertices, all edges of at least one shortest path between them receive different colors. The minimum number of colors necessary in such a coloring is the strong rainbow connection number (\(\mathbf{src(G) \)) of the graph. When proving upper bounds on \(\mathbf{src(G) \), it is natural to prove that a coloring exists where, for every shortest path between every pair of vertices in the graph, all edges of the path receive different colors. Therefore, we introduce and formally define this more restricted edge coloring number, which we call very strong rainbow connection number (\(\mathbf{vsrc(G) \)). In this paper, we give upper bounds on \(\mathbf{vsrc(G) \) for several graph classes, some of which are tight. These immediately imply new upper bounds on \(\mathbf{src(G) \) for these classes, showing that the study of \(\mathbf{vsrc(G) \) enables meaningful progress on bounding \(\mathbf{src(G) \). Then we study the complexity of the problem to compute \(\mathbf{vsrc(G) \), particularly for graphs of bounded treewidth, and show this is an interesting problem in its own right. We prove that \(\mathbf{vsrc(G) \) can be computed in polynomial time on cactus graphs; in contrast, this question is still open for \(\mathbf{src(G) \). We also observe that deciding whether \(\mathbf{vsrc(G) = k \) is fixed-parameter tractable in k and the treewidth of G. Finally, on general graphs, we prove that there is no polynomial-time algorithm to decide whether \(\mathbf{vsrc(G) \leq 3 \) nor to approximate \(\mathbf{vsrc(G) \) within a factor \( n^1-\varepsilon \), unless P = NP.
Issac, Davis; van Leeuwen, Erik Jan; Lauri, Juho; Lima, Paloma; Heggernes, Pinar Rainbow Vertex Coloring Bipartite Graphs and Chordal GraphsMathematical Foundations of Computer Science (MFCS) 2018: 1–13
Given a graph with colors on its vertices, a path is called a rainbow vertex path if all its internal vertices have distinct colors. We say that the graph is rainbow vertex-connected if there is a rainbow vertex path between every pair of its vertices. We study the problem of deciding whether the vertices of a given graph can be colored with at most k colors so that the graph becomes rainbow vertex-connected. Although edge-colorings have been studied extensively under similar constraints, there are significantly fewer results on the vertex variant that we consider. In particular, its complexity on structured graph classes was explicitly posed as an open question. We show that the problem remains NP-complete even on bipartite apex graphs and on split graphs. The former can be seen as a first step in the direction of studying the complexity of rainbow coloring on sparse graphs, an open problem which has attracted attention but limited progress. We also give hardness of approximation results for both bipartite and split graphs. To complement the negative results, we show that bipartite permutation graphs, interval graphs, and block graphs can be rainbow vertex-connected optimally in polynomial time.
Göbel, Andreas; Lagodzinski, J. A. Gregor; Seidel, Karen Counting Homomorphisms to Trees Modulo a PrimeMathematical Foundations of Computer Science (MFCS) 2018: 49:1–49:13
Many important graph theoretic notions can be encoded as counting graph homomorphism problems, such as partition functions in statistical physics, in particular independent sets and colourings. In this article we study the complexity of~\($\#_p\textsc{HomsTo}H$\), the problem of counting graph homomorphisms from an input graph to a graph \($H$\) modulo a prime number~\($p$\). Dyer and Greenhill proved a dichotomy stating that the tractability of non-modular counting graph homomorphisms depends on the structure of the target graph. Many intractable cases in non-modular counting become tractable in modular counting due to the common phenomenon of cancellation. In subsequent studies on counting modulo~\($2$\), however, the influence of the structure of~\($H$\) on the tractability was shown to persist, which yields similar dichotomies. Our main result states that for every tree~\($H$\) and every prime~\($p$\) the problem \($\#_p\textsc{HomsTo}H$\) is either polynomial time computable or \($\#_p\mathsf{P}$\)-complete. This relates to the conjecture of Faben and Jerrum stating that this dichotomy holds for every graph \($H$\) when counting modulo~2. In contrast to previous results on modular counting, the tractable cases of \($\#_p\textsc{HomsTo}H$\) are essentially the same for all values of the modulo when \($H$\) is a tree. To prove this result, we study the structural properties of a homomorphism. As an important interim result, our study yields a dichotomy for the problem of counting weighted independent sets in a bipartite graph modulo some prime~\($p$\). These results are the first suggesting that such dichotomies hold not only for the one-bit functions of the modulo~2 case but also for the modular counting functions of all primes~\($p$\).
Kötzing, Timo; Lagodzinski, J. A. Gregor; Lengler, Johannes; Melnichenko, Anna Destructiveness of Lexicographic Parsimony Pressure and Alleviation by a Concatenation Crossover in Genetic ProgrammingParallel Problem Solving From Nature (PPSN) 2018: 42–54
For theoretical analyses there are two specifics distinguishing GP from many other areas of evolutionary computation. First, the variable size representations, in particular yielding a possible bloat (i.e. the growth of individuals with redundant parts). Second, the role and realization of crossover, which is particularly central in GP due to the tree-based representation. Whereas some theoretical work on GP has studied the effects of bloat, crossover had a surprisingly little share in this work. We analyze a simple crossover operator in combination with local search,where a preference for small solutions minimizes bloat (lexicographic parsimony pressure); the resulting algorithm is denoted ConcatenationCrossover GP. For this purpose three variants of the well-studied Majority test function with large plateaus are considered. We show that the Concatenation Crossover GP can efficiently optimize these test functions,while local search cannot be efficient for all three variants independent of employing bloat control.
Kötzing, Timo; Krejca, Martin S. First-Hitting Times for Finite State SpacesParallel Problem Solving From Nature (PPSN) 2018: 79–91
One of the most important aspects of a randomized algorithm is bounding its expected run time on various problems. Formally speaking, this means bounding the expected first-hitting time of a random process. The two arguably most popular tools to do so are the fitness level method and drift theory. The fitness level method considers arbitrary transition probabilities but only allows the process to move toward the goal. On the other hand, drift theory allows the process to move into any direction as long as it moves closer to the goal in expectation; however, this tendency has to be monotone and, thus, the transition probabilities cannot be arbitrary. We provide a result that combines the benefit of these two approaches: our result gives a lower and an upper bound for the expected first-hitting time of a random process over \(\{0, \ldots ,n\}\) that is allowed to move forward and backward by 1 and can use arbitrary transition probabilities. In case that the transition probabilities are known, our bounds coincide and yield the exact value of the expected first-hitting time. Further, we also state the stationary distribution as well as the mixing time of a special case of our scenario.
Kötzing, Timo; Krejca, Martin S. First-Hitting Times Under Additive DriftParallel Problem Solving From Nature (PPSN) 2018: 92–104
For the last ten years, almost every theoretical result concerning the expected run time of a randomized search heuristic used drift theory, making it the arguably most important tool in this domain. Its success is due to its ease of use and its powerful result: drift theory allows the user to derive bounds on the expected first-hitting time of a random process by bounding expected local changes of the process – the drift. This is usually far easier than bounding the expected first-hitting time directly. Due to the widespread use of drift theory, it is of utmost importance to have the best drift theorems possible. We improve the fundamental additive, multiplicative, and variable drift theorems by stating them in a form as general as possible and providing examples of why the restrictions we keep are still necessary. Our additive drift theorem for upper bounds only requires the process to be nonnegative, that is, we remove unnecessary restrictions like a finite, discrete, or bounded search space. As corollaries, the same is true for our upper bounds in the case of variable and multiplicative drift
Frahnow, Clemens; Kötzing, Timo Ring Migration Topology Helps Bypassing Local OptimaParallel Problem Solving From Nature (PPSN) 2018: 129–140
Running several evolutionary algorithms in parallel and occasionally exchanging good solutions is referred to as island models. The idea is that the independence of the different islands leads to diversity, thus possibly exploring the search space better. Many theoretical analyses so far have found a complete (or sufficiently quickly expanding) topology as underlying migration graph most efficient for optimization, even though a quick dissemination of individuals leads to a loss of diversity. We suggest a simple fitness function Fork with two local optima parametrized by \(r \geq 2\) and a scheme for composite fitness functions. We show that, while the (1+1) EA gets stuck in a bad local optimum and incurs a run time of \(\Theta(n^{2r})\) fitness evaluations on Fork, island models with a complete topology can achieve a run time of \(\Theta(n^{1.5r})\) by making use of rare migrations in order to explore the search space more effectively. Finally, the ring topology, making use of rare migrations and a large diameter, can achieve a run time of \(\tilde{\Theta}(n^r)\), the black box complexity of Fork. This shows that the ring topology can be preferable over the complete topology in order to maintain diversity.
Friedrich, Tobias; Göbel, Andreas; Quinzan, Francesco; Wagner, Markus Heavy-tailed Mutation Operators in Single-Objective Combinatorial OptimizationParallel Problem Solving From Nature (PPSN) 2018: 134–145
A core feature of evolutionary algorithms is their mutation operator. Recently, much attention has been devoted to the study of mutation operators with dynamic and non-uniform mutation rates. Following up on this line of work, we propose a new mutation operator and analyze its performance on the (1+1) Evolutionary Algorithm (EA). Our analyses show that this mutation operator competes with pre-existing ones, when used by the (1+1)EA on classes of problems for which results on the other mutation operators are available. We present a jump function for which the performance of the (1+1)EA using any static uniform mutation and any restart strategy can be worse than the performance of the (1+1)EA using our mutation operator with no restarts. We show that the (1+1)EA using our mutation operator finds a (1/3)-approximation ratio on any non-negative submodular function in polynomial time. This performance matches that of combinatorial local search algorithms specifically designed to solve this problem. Finally, we evaluate experimentally the performance of the (1+1)EA using our operator, on real-world graphs of different origins with up to \(\sim\)37,000 vertices and \(\sim\)1.6 million edges. In comparison with uniform mutation and a recently proposed dynamic scheme our operator comes out on top on these instances.
Chauhan, Ankit; Lenzner, Pascal; Molitor, Louise Schelling Segregation with Strategic AgentsSymposium on Algorithmic Game Theory (SAGT) 2018
Schelling’s segregation model is a landmark model in sociology. It shows the counter-intuitive phenomenon that residential segregation between individuals of different groups can emerge even when all involved individuals are tolerant. Although the model is widely studied, no pure game-theoretic version where rational agents strategically choose their location exists. We close this gap by introducing and analyzing generalized game-theoretic models of Schelling segregation, where the agents can also have individual location preferences. For our models we investigate the convergence behavior and the efficiency of their equilibria. In particular, we prove guaranteed convergence to an equilibrium in the version which is closest to Schelling’s original model. Moreover, we provide tight bounds on the Price of Anarchy.
Cseh, Ágnes; Fleiner, Tamás The Complexity of Cake Cutting with Unequal SharesSymposium Algorithmic Game Theory (SAGT) 2018: 19–30
An unceasing problem of our prevailing society is the fair division of goods. The problem of proportional cake cutting focuses on dividing a heterogeneous and divisible resource, the cake, among n players who value pieces according to their own measure function. The goal is to assign each player a not necessarily connected part of the cake that the player evaluates at least as much as her proportional share. In this paper, we investigate the problem of proportional division with unequal shares, where each player is entitled to receive a predetermined portion of the cake. Our main contribution is threefold. First we present a protocol for integer demands that delivers a proportional solution in fewer queries than all known algorithms. Then we show that our protocol is asymptotically the fastest possible by giving a matching lower bound. Finally, we turn to irrational demands and solve the proportional cake cutting problem by reducing it to the same problem with integer demands only. All results remain valid in a highly general cake cutting model, which can be of independent interest.
Friedrich, Tobias; Rothenberger, Ralf Sharpness of the Satisfiability Threshold for Non-Uniform Random k-SATTheory and Applications of Satisfiability Testing (SAT) 2018: 273–291
Best Paper Award
We study non-uniform random k-SAT on n variables with an arbitrary probability distribution p on the variable occurrences. The number \(t = t(n)\) of randomly drawn clauses at which random formulas go from asymptotically almost surely (a.a.s.) satisfiable to a.a.s. unsatisfiable is called the satisfiability threshold. Such a threshold is called sharp if it approaches a step function as n increases. We show that a threshold t(n) for random k-SAT with an ensemble \((p_n)_{n\in\mathbb{N}}\) of arbitrary probability distributions on the variable occurrences is sharp if \(\|p\|_2^2 = O_n(t^{-2/k})\) and \(\|p\|_\infty\) \(= o_n(t^-k/(2k-1) \log^{-(k-1)/(2k-1)(t))\). This result generalizes Friedgut’s sharpness result from uniform to non-uniform random k-SAT and implies sharpness for thresholds of a wide range of random k-SAT models with heterogeneous probability distributions, for example such models where the variable probabilities follow a power-law distribution.
Battaglia, Francesco; Cucina, Domenico; Rizzo, Manuel Generalized Periodic Autoregressive Models for Trend and Seasonality Varying Time SeriesScientific Meeting of the Italian Statistical Society (SIS) 2018
Many nonstationary time series exhibit changes in the trend and seasonality structure, that may be modeled by splitting the time axis into different regimes. We propose multi-regime models where, inside each regime, the trend is linear and seasonality is explained by a Periodic Autoregressive model. In addition, for achieving parsimony, we allow season grouping, i.e. seasons may consist of one, two, or more consecutive observations. Identification is obtained by means of a Genetic Algorithm that minimizes an identification criterion.
Bläsius, Thomas; Eube, Jan; Feldtkeller, Thomas; Friedrich, Tobias; Krejca, Martin S.; Lagodzinski, J. A. Gregor; Rothenberger, Ralf; Severin, Julius; Sommer, Fabian; Trautmann, Justin Memory-restricted Routing With Tiled Map DataSystems, Man, and Cybernetics (SMC) 2018: 3347–3354
Modern routing algorithms reduce query time by depending heavily on preprocessed data. The recently developed Navigation Data Standard (NDS) enforces a separation between algorithms and map data, rendering preprocessing inapplicable. Furthermore, map data is partitioned into tiles with respect to their geographic coordinates. With the limited memory found in portable devices, the number of tiles loaded becomes the major factor for run time. We study routing under these restrictions and present new algorithms as well as empirical evaluations. Our results show that, on average, the most efficient algorithm presented uses more than 20 times fewer tile loads than a normal A*.
Bilò, Davide; Lenzner, Pascal On the Tree Conjecture for the Network Creation GameSymposium on the Theoretical Aspects of Computer Science (STACS) 2018: 14:1–14:15
Selfish Network Creation focuses on modeling real world networks from a game-theoretic point of view. One of the classic models by Fabrikant et al. [PODC'03] is the network creation game, where agents correspond to nodes in a network which buy incident edges for the price of alpha per edge to minimize their total distance to all other nodes. The model is well-studied but still has intriguing open problems. The most famous conjectures state that the price of anarchy is constant for all \($\alpha$\) and that for \($\alpha \geq n$\) all equilibrium networks are trees. We introduce a novel technique for analyzing stable networks for high edge-price alpha and employ it to improve on the best known bounds for both conjectures. In particular we show that for \($\alpha > 4n-13$\) all equilibrium networks must be trees, which implies a constant price of anarchy for this range of alpha. Moreover, we also improve the constant upper bound on the price of anarchy for equilibrium trees.
Bläsius, Thomas; Friedrich, Tobias; Katzmann, Maximilian; Krohmer, Anton; Striebel, Jonathan Towards a Systematic Evaluation of Generative Network ModelsWorkshop on Algorithms and Models for the Web Graph (WAW) 2018: 99–114
Generative graph models play an important role in network science. Unlike real-world networks, they are accessible for mathematical analysis and the number of available networks is not limited. The explanatory power of results on generative models, however, heavily depends on how realistic they are. We present a framework that allows for a systematic evaluation of generative network models. It is based on the question whether real-world networks can be distinguished from generated graphs with respect to certain graph parameters. As a proof of concept, we apply our framework to four popular random graph models (Erdős-Rényi, Barabási-Albert, Chung-Lu, and hyperbolic random graphs). Our experiments for example show that all four models are bad representations for Facebook's social networks, while Chung-Lu and hyperbolic random graphs are good representations for other networks, with different strengths and weaknesses.
Fischbeck, Philipp On the Effectiveness of Data Reduction for Covering Problems in Real-World Transit NetworksMaster Thesis, Hasso Plattner Institute 2018
Given a graph and a set of paths, we want to find the minimal set of vertices such that each path is covered by at least one chosen vertex. Although this problem is NP-hard, real-world instances can be solved almost completely by a set of simple reduction rules. We examine this behavior from a theoretical and empirical perspective. First, we show that the problem is easy to solve for forests and cycle graphs. However, the problem is NP-hard for a feedback vertex number of 2 and a treewidth of 3. This indicates that the explanation for the effectiveness does not lie in the graph representation of problem instances. Thus, we examine the Hitting Set problem that arises when ignoring the graph representation and interpreting a path as a mere set of vertices. Through this relation, we show that the problem remains NP-hard even for very strong restrictions. Hitting Set instances that have a representation as a path graph can be recognized as such in polynomial time. However, finding the graph representation with the fewest edges is NP-hard. Based on the analysis of publicly available transit datasets, we show that the real-world instances are clustered and have heterogeneous stations, with the number of lines per station distributed according to a power law. We describe a model to generate random problem instances with adjustable clustering and heterogeneity. We use this model to show that while the heterogeneity does positively influence the effectiveness of the reduction rules, the largest effect comes from the clustering. Lastly, we show a strong relation between the reduction rules for the Hitting Set problem and reduction rules for the Independent Set problem on the intersection graph of the family of sets. We prove that the size of any independent set is a lower bound on the size of the maximum hitting set and show that the two bounds are very close for real-world instances. We show that the reduction rules need to be effective for Independent Set in order for them to be effective for Hitting Set.
Neubert, Stefan Mechanisms for Network CreationMaster Thesis, Hasso Plattner Institute 2018
We introduce and analyze the possibilities of a new model for network creation by autonomous selfish agents: Unlike in typical network formation games such as the well-known model of Fabrikant et al. [PODC'03], the final network is not directly shaped by the players of a game. Instead, we design mechanisms that take edge preferences of agents as input for a social choice function and return a network that respects those preferences. In addition, it caters for compliance with global restrictions on the network topology and tries to establish several properties, such as global efficiency, maximizing the individual utility of agents, and building stable networks, especially in comparison to the result of an anarchic network formation. The mechanism also aims to produce Nash equilibria and encourages agents to honestly declare their preferences instead of playing strategically. The mechanism approach is a true superset of both centralized network design and uncoordinated network creation games. To the best of our knowledge this is the first attempt to explore the realm inbetween those extremes.
Rizzo, Manuel Contributions on Evolutionary Computation for Statistical InferenceDoctoral Dissertation, Sapienza University of Rome 2018