**Chair for Algorithm Engineering**

Hasso Plattner Institute

Office: A-1.6

E-Mail: Marcus.Pappik(at)hpi.de

Links: Homepage

Supervisor: Prof. Dr. Tobias Friedrich

Advisors: Dr. Andreas Göbel, Dr. Martin Krejca

We investigate the complexity of algorithmic problems from statistical physics. More precisely, we are interested in quantitative relations between continuous models, for which algorithmically little is known, and discrete models, which have been studies extensively in the past decades. This way, we can obtain new efficient approximations for continuous models by using known algorithms for discrete models as tools.

## Introduction

Statistical physics commonly describes systems of multiple particles via probability distributions. This is usually done by assigning each possible state of the system some weight that is inversely related to the energy of the state (i.e., high energy states have low weight). The *Gibbs distribution* is then defined as the probability distribution that assigns each state a probability proportionally to its weight. Sampling from this distribution can be seen as a way of simulating the system.

The required normalizing constant of the distribution is called the *partition function* of the system. A variety of other relevant quantities of the system can be derived from it, and computing the partition function and sampling from the Gibbs distribution are closely related. However, the partition function is notoriously hard to calculate explicitely. This leads to the obvious question in which cases an efficient approximation of the partition function of a system can be obtained algorithmically.

Investigating such computational properties has led to a rich exchange between statistical physics and theoretical computer science. More precisely, proof techniques from statistical physics have been turned into algorithmic tools (e.g., polymer models) and the deeper computational understanding of such models resulted in new insights in their physical behaviour (e.g., new phase transition results). Nevertheless, many questions remain still open. To point out some of these open questions, it helps to roughly devide models in statistical physics into two categories. Namely, models that use a discrete representation of space and models in continuous space.

### Discrete Models (Spin Systems)

Discrete models represent space usually using an undirected graph \(G=(V, E)\). In this setting, the vertices of a graph represent positions in space and edges represent possible interactions. A state of the system consists of an assignment \(\sigma: V \to S\) where \(S\) is a finite set of spins. Depending on the specific model, the weight of a configuration is then determine by the assigned spins and how they interact along edges. If the underlying graph is finite, the partition function can be expressed as the sum of all configuration weights. As this description is rather abstract, it may help to look at an example.

**Example: the hard-core model**

An instance of the hard-core model is defined by an undirected graph \(G = (V, E)\) and a parameter \(\lambda \in \mathbb{R}_{\ge 0}\). Each state of this model is an assignment \(\sigma: V \to \{0, 1\}\). Each such assignment \(\sigma\) has weight \(w(\sigma) = 0\) if two adjacent vertices have spin 1 (i.e., \(\sigma(v) = 1\) and \(\sigma(w) = 1\) for \((v, w) \in E\)). Otherwise, the weight of \(\sigma\) is \(w(\sigma) = \lambda^{|\sigma|_{1}}\) where \(|\sigma|_{1}\) is the number of vertices \(v \in V\) such that \(\sigma(v)=1\). One intepretation of this model is that vertics represent positions in space, which can be occupied by a particle (spin 1) while no two adjacent positions can be occupied at the same time. The parameter \(\lambda\) influences the expected density of particles in space.

### Continuous Models (Gibbs Point Processes)

A different class of models arises if we do not restrict the underlying space to be a graph but instead allow for continuous spaces. Often Gibbs point processes on measurable and bounded regions of \(d\)-dimensional Euclidean space \( \mathbb{V} \subset \mathbb{R}^{d}\) are considered. The states of such a Gibbs point process are finite point sets in \(\mathbb{V}\), each equipped with a weight depending on the specific model in question. The Gibbs distribution is defined via a probability density with respect to the probability measure of a Poisson point process of intensity 1 that is proportional to these weights. Again, we look at an example for clarification.

**Example: hard-sphere model**

An instance of the hard-sphere model in Euclidean space is defined based on a measurable and bounded region \(\mathbb{V} \subset \mathbb{R}^{d}\) and parameters \(r, \lambda \in \mathbb{R}_{\ge 0}\). Each state of the system is a finite point set \(X \subset \mathbb{V}\). The weight of such a point set is \(w(X) = 0\) if any pair of distinct points \(x, y \in X\) have Euclidean distance \(d(x, y) < 2 r\). Otherwise, the weight is \(w(X) = \lambda^{|X|}\). The Gibbs distribbution is defined via a probability density with respect to the probability measure of a Poisson point process of intensity 1, which is proportional to \(w\). One interpretation of this model is that particles are spherical objects of radius \(r\) with centers distributed according a Poisson point process of intensity \(\lambda\) on \(\mathbb{V}\) conditioned on no two particles overlapping.

## Known Algorithmic Results

Approximating the partition function of discrete models (spin systems) has been studied extensively in the computer science community. One of the most celebrated results in this field is that for all \(\Delta \in \mathbb{N}\) there is a critical threshold \(\lambda_{c}(\Delta)\) such that for all \(\lambda < \lambda_{c}(\Delta)\) the hard-core partition function on a graph of maximum degree \(\Delta\) can be approximated in polynomial time [1, 2], while for \(\lambda > \lambda_{c}(\Delta)\) the problem becomes NP-hard [3, 4]. Interestingly, this threshold \(\lambda_{c}(\Delta)\) was known in statistical physics before as the point of a probabilistic phase transition in infinite \(\Delta\)-regular trees, hinting at a relation between physical and computational phase transitions. Similar results where obtained for a variety of discrete models.

In contrast to that, little is known about the computational nature of continuous models, such as the hard-sphere model. While some results were obtained in terms of sampling from the Gibbs distribution (most of which are non-rigorous), hardly any approximation result is known.

## Our Results: Quantitative Relations Between Discrete and Continuous Models

When looking a the discrete hard-core model and the continuous hard-sphere model, some similarities seem to exist. Both models deal with distributions of particles that exlclude each other based on some neighborhood structure. This is not by accident. In fact, many discrete models were developed with the intention the simplify the combinatorial structure of continuous models while preserving qualitative aspects of their behavior.

Our research aims at proving that these discrete models might as well be used as a quantitative estimate for their continuous counterparts. To this end, we showed in our first paper [5] that for cubic regions in Euclidean space (i.e., \(\mathbb{V} = [0, \ell]^{d}\) for some \(\ell \in \mathbb{R}_{\ge 0}\)) we can construct a hard-core instance that closely approximates the hard-sphere model on \(\mathbb{V}\) in terms of its partition function. Even though the required number of vertices is exponential in the volume of \(\mathbb{V}\), we a polynomial time algorithm for the hard-core partition function can still be obtained. We do this by showing that we can use a Monte-Carlo Markov chain algorithm based on non-local dynamics without explicitely constructing the entire graph. This yields the first rigorous proof of an efficient randomized approximation for hard-sphere partition functions for the entire paramter regime up to \(\lambda < \frac{\text{e}}{2^{d}}\), which coincides with the known regime for absence of a phase transition.

In a more recent work [6], we generalize and improve this result in multiple ways. First, we allow for a broader class of continuous models with hard-constraints, such as the continuous Widom-Rowlinson model. Further, we generalize our construction of the hard-core instance to arbitrary regions \(\mathbb{V} \subset \mathbb{R}^{d}\) that are measurable, bounded and star-convex. Moreover, we show that the required number of points for such regions depends polynomially on the size of certain 'nice' partitionings of \(\mathbb{V}\). For the setting of cubic regions that we considered in our initial paper, this improves the required number of vertices from exponential to quaratic in volume of \(\mathbb{V}\), which we argue to be tight. Due to this improvement, we obtain the first efficient (quasi-polynomial) deterministic approximation algorithm in this setting. Lastely, we show that this result not only applies to one specific way of constructing the hard-core instance, but in fact asymptotically almost surely every uniform geometric random graph on \(\mathbb{V}\) with appropriate edge length can be used.

## Teaching

During my Ph.D. studies, I was **Teaching Assistant** for the following courses:

- Introduction to Quantum Computing (Sommer 2021)
- Theorie der künstlichen Intelligenz (Winter 2020/2021)
- Theory of Evolutionary Algorithms (Summer 2020)

## Publications

2022 [ to top ]

- Friedrich, Tobias; Göbel, Andreas; Krejca, Martin S.; Pappik, Marcus
**A Spectral Independence View on Hard Spheres via Block Dynamics**SIAM Journal on Discrete Mathematics 2022: 2282–2322 - Friedrich, Tobias; Göbel, Andreas; Katzmann, Maximilian; Krejca, Martin S.; Pappik, Marcus
**Algorithms for hard-constraint point processes via discretization**International Computing and Combinatorics Conference (COCOON) 2022 - Baguley, Samuel; Friedrich, Tobias; Timo, Kötzing; Li, Xiaoyue; Pappik, Marcus; Zeif, Ziena
**Analysis of a Gray-Box Operator for Vertex Cover**Genetic and Evolutionary Computation Conference (GECCO) 2022: 1363–1371

2021 [ to top ]

2019 [ to top ]

2018 [ to top ]

2017 [ to top ]

- Kirsch, Louis; Riekenbrauck, Niklas; Thevessen, Daniel; Pappik, Marcus; Stebner, Axel; Kunze, Julius; Meissner, Alexander; Kumar Shekar, Arvind; Müller, Emmanuel
**Framework for Exploring and Understanding Multivariate Correlations**Machine Learning and Knowledge Discovery in Databases (ECML/PKDD) 2017: 404–408

## References

- Dror Weitz. “Counting independent sets up to the tree threshold”. In: Proc. of STOC'06.
- Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. "Spectral independence in high-dimensional expanders and applications to the hardcore model.". In: Proc. of FOCS'20
- Allan Sly. “Computational transition at the uniqueness threshold”. In: Proc. of FOCS'10
- Andreas Galanis, Qi Ge, Daniel Stefankovic, Eric Vigoda, and Linji Yang. “Improved inapproximability results for counting independent sets in the hard-core model”. In: Random Structures and Algorithms 45.1 (2014)
- Tobias Friedrich, Andreas Göbel, Martin Krejca, and Marcus Pappik. "A spectral independence view on hard spheres via block dynamics.". In: Proc. of ICALP'21
- Tobias Friedrich, Andreas Göbel, Maximilian Katzmann, Martin Krejca, and Marcus Pappik. "Algorithms for general hard-constraint point processes via discretization.". currently under review at SODA'22