**Chair for Algorithm Engineering**

Hasso Plattner Institute

Office: K-2.19/20

E-Mail: Marcus.Pappik(at)hpi.de

Links: Homepage

Supervisor: Prof. Dr. Tobias Friedrich

Advisors: Dr. Andreas Göbel, Dr. Martin Krejca

We investigate the complexity of algorithmic problems from statistical physics. Models in statistical physics commonly come tin two different flavours. On the one hand, there are discrete spin system, which algorithmic properties have been studied extensively in the past decades. On the other hand, there are continuous point processes, for which significantly less is known from a computational perspective. The goal of our work is to bridge this gap by introducing new algorithms for continuous point processes, using existing algorithms for discrete spin systems as a foundation.

## Introduction

Statistical physics commonly describes systems of multiple particles via probability distributions. This is usually done by assigning each possible state of the system some weight that is inversely related to the energy of the state (i.e., high energy states have low weight). The *Gibbs distribution* is then defined as the probability distribution that assigns each state a probability proportionally to its weight. Sampling from this distribution can be seen as a way of simulating the system.

The required normalizing constant of the distribution is called the *partition function* of the system. A variety of other relevant quantities of the system can be derived from it, and computing the partition function and sampling from the Gibbs distribution are closely related. However, the partition function is notoriously hard to calculate explicitely. This leads to the obvious question in which cases an efficient approximation of the partition function of a system can be obtained algorithmically.

Investigating such computational properties has led to a rich exchange between statistical physics and theoretical computer science. More precisely, proof techniques from statistical physics have been turned into algorithmic tools (e.g., polymer models) and the deeper computational understanding of such models resulted in new insights in their physical behaviour (e.g., new phase transition results). Nevertheless, many questions remain still open. To point out some of these open questions, it helps to roughly devide models in statistical physics into two categories. Namely, models that use a discrete representation of space and models in continuous space.

### Discrete Models (Spin Systems)

Discrete models represent space usually using an undirected graph \(G=(V, E)\). In this setting, the vertices of a graph represent positions in space and edges represent possible interactions. A state of the system consists of an assignment \(\sigma: V \to S\) where \(S\) is a finite set of spins. Depending on the specific model, the weight of a configuration is then determine by the assigned spins and how they interact along edges. If the underlying graph is finite, the partition function can be expressed as the sum of all configuration weights. As this description is rather abstract, it may help to look at an example.

**Example: the hard-core model**

An instance of the hard-core model is defined by an undirected graph \(G = (V, E)\) and a parameter \(\lambda \in \mathbb{R}_{\ge 0}\). Each state of this model is an assignment \(\sigma: V \to \{0, 1\}\). Each such assignment \(\sigma\) has weight \(w(\sigma) = 0\) if two adjacent vertices have spin 1 (i.e., \(\sigma(v) = 1\) and \(\sigma(w) = 1\) for \((v, w) \in E\)). Otherwise, the weight of \(\sigma\) is \(w(\sigma) = \lambda^{|\sigma|_{1}}\) where \(|\sigma|_{1}\) is the number of vertices \(v \in V\) such that \(\sigma(v)=1\). One intepretation of this model is that vertics represent positions in space, which can be occupied by a particle (spin 1) while no two adjacent positions can be occupied at the same time. The parameter \(\lambda\) influences the expected density of particles in space.

### Continuous Models (Gibbs Point Processes)

A different class of models arises if we do not restrict the underlying space to be a graph but instead allow for continuous spaces. Often Gibbs point processes on measurable and bounded regions of \(d\)-dimensional Euclidean space \( \mathbb{V} \subset \mathbb{R}^{d}\) are considered. The states of such a Gibbs point process are finite point sets in \(\mathbb{V}\), each equipped with a weight depending on the specific model in question. The Gibbs distribution is defined via a probability density with respect to the probability measure of a Poisson point process of intensity 1 that is proportional to these weights. Again, we look at an example for clarification.

**Example: hard-sphere model**

An instance of the hard-sphere model in Euclidean space is defined based on a measurable and bounded region \(\mathbb{V} \subset \mathbb{R}^{d}\) and parameters \(r, \lambda \in \mathbb{R}_{\ge 0}\). Each state of the system is a finite point set \(X \subset \mathbb{V}\). The weight of such a point set is \(w(X) = 0\) if any pair of distinct points \(x, y \in X\) have Euclidean distance \(d(x, y) < 2 r\). Otherwise, the weight is \(w(X) = \lambda^{|X|}\). The Gibbs distribbution is defined via a probability density with respect to the probability measure of a Poisson point process of intensity 1, which is proportional to \(w\). One interpretation of this model is that particles are spherical objects of radius \(r\) with centers distributed according a Poisson point process of intensity \(\lambda\) on \(\mathbb{V}\) conditioned on no two particles overlapping.

## Known Algorithmic Results

Approximating the partition function and sampling from the Gibbs distribution of discrete models (spin systems) has been studied extensively by the computer science community. One of the most celebrated results in this field is that for all \(\Delta \in \mathbb{N}\) there is a critical threshold \(\lambda_{c}(\Delta)\) such that for all \(\lambda < \lambda_{c}(\Delta)\) the hard-core partition function on a graph of maximum degree \(\Delta\) can be approximated in polynomial time [1, 2], while for \(\lambda > \lambda_{c}(\Delta)\) the problem becomes NP-hard [3, 4]. Interestingly, this threshold \(\lambda_{c}(\Delta)\) was known in statistical physics before as the point of a probabilistic phase transition in infinite \(\Delta\)-regular trees, hinting at a relation between physical and computational phase transitions. Similar results where obtained for a variety of discrete models.

In comparison, significantly less is known about the computational properties of continuous models (Gibbs point processes). The following is a summary of our contributions to a better algorithmic understanding of such models. Besides what is shown in the following section, remarkable new results were recently obtained in [8, 9].

## Our Results: Discretization-based Algorithms for Gibbs Point Processes

From a high-level perspective, the discription of discrete spin systems and continuous Gibbs point processes seem to be very similar in flavour. This is not coincidence. In fact, many discrete models were developed with the intention the simplify the combinatorial structure of continuous models while preserving qualitative aspects of their behavior.

Our research aims at proving that these discrete models might as well be used as a quantitative estimate for their continuous counterparts. Along this reasoning, our general approach is to express the computational problems for continuous Gibbs point processes in terms of there corresponding counterparts for carefully constructed discrete spin systems. We then obtain algorithmic results by using known algorithms for those discrete models.

In our first paper [5] we show that, for cubic regions in Euclidean space (i.e., \(\mathbb{V} = [0, \ell]^{d}\) for some \(\ell \in \mathbb{R}_{\ge 0}\)), we can construct a hard-core instance that closely approximates the hard-sphere model on \(\mathbb{V}\) in terms of its partition function. Even though the required number of vertices is exponential in the volume of \(\mathbb{V}\), we a polynomial time algorithm for the hard-core partition function can still be obtained. We do this by showing that we can use a Monte-Carlo Markov chain algorithm based on non-local dynamics without explicitely constructing the entire graph. This yields the first rigorous proof of an efficient randomized approximation for hard-sphere partition functions up to \(\lambda < \frac{\text{e}}{2^{d}}\).

In [6], we improve the previous result by showing that the required number of points for the discretization can be reduced to be polynomial in the volume of \(\mathbb{V}\), given that \(\mathbb{V}\) exhibits a 'nice' partitioning. In particular, for the setting of cubic regions that we considered in our initial paper, this improves the required number of vertices from exponential to quadratic in the volume of \(\mathbb{V}\), which we argue to be tight. This has several algorithmic implications: Firstly, it allows us to obtain the first efficient (quasi-polynomial) deterministic approximation algorithm for the hard-sphere partition function. Further, it allows us to apply our discretization approach to a broader class of continuous models with hard-constraint interactions, such as the continuous Widom-Rowlinson model. Lastly, we show that this result not only applies to one specific way of constructing the hard-core instance, but asymptotically almost every uniform geometric random graph on \(\mathbb{V}\) with appropriate edge length can be used.

In [7], we study the possibility of extending our discretization approach to the much broader class of Gibbs point processes with repulsive pair potentials. Formally, such models are parameterized by a fugacity \(\lambda \in \mathbb{R}\) and a pair potential function \(\phi: \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}_{\ge 0} \cup \{\infty\}\). The model is then defined by assigning every finite point configuration \(X \subset \mathbb{V}\) a density with respect to a Poisson point process with intensity 1 that is proportional to \(\lambda^{|X|} \text{e}^{- \sum_{x, y \in X} \phi(x, y)}\). The main difficulty when discretizing such point processes is that the potential \(\phi\) can have a very pathological structure, and that particles might share 'soft' interactions, which are hard to capture in a hard-core model. We overcome these problems by randomization. More precisely, for a given potential \(\phi\) and a bounded measurable region \(\mathbb{V} \subset \mathbb{R}^{d}\), we carefully construct a class of geometric random graphs, such that their hard-core partition functions closely concentrate around the partition function of the continuous Gibbs point process. This allows us to obtain randomized approximations for the partition functions of repulsive Gibbs point processes up to a fugacity of \(\lambda < \frac{\text{e}}{C_{\phi}}\), where \(C_{\phi}\) denotes the temperedness constant of the potential \(\phi\).

## Teaching

During my Ph.D. studies, I was **Teaching Assistant** for the following courses:

- Introduction to Quantum Computing (Sommer 2021)
- Theorie der künstlichen Intelligenz (Winter 2020/2021)
- Theory of Evolutionary Algorithms (Summer 2020)

Moreover, I was **co-supervising** the following projects and theses:

- Bachelor Project: Algorithm Auditing via Statistical Hypothesis Testing (Winter 2022/2023 + Summer 2023)
- Master's Thesis: Epidemic thresholds for the SIS contact process on stars and clique stars (by Nicolas Klodt, Summer 2021)

## Publications

2023 [ nach oben ]

2022 [ nach oben ]

- Friedrich, Tobias; Göbel, Andreas; Krejca, Martin S.; Pappik, Marcus
**A Spectral Independence View on Hard Spheres via Block Dynamics**SIAM Journal on Discrete Mathematics 2022: 2282–2322 - Friedrich, Tobias; Göbel, Andreas; Katzmann, Maximilian; Krejca, Martin S.; Pappik, Marcus
**Algorithms for hard-constraint point processes via discretization**International Computing and Combinatorics Conference (COCOON) 2022 - Baguley, Samuel; Friedrich, Tobias; Timo, Kötzing; Li, Xiaoyue; Pappik, Marcus; Zeif, Ziena
**Analysis of a Gray-Box Operator for Vertex Cover**Genetic and Evolutionary Computation Conference (GECCO) 2022: 1363–1371

2021 [ nach oben ]

2019 [ nach oben ]

2018 [ nach oben ]

2017 [ nach oben ]

- Kirsch, Louis; Riekenbrauck, Niklas; Thevessen, Daniel; Pappik, Marcus; Stebner, Axel; Kunze, Julius; Meissner, Alexander; Kumar Shekar, Arvind; Müller, Emmanuel
**Framework for Exploring and Understanding Multivariate Correlations**Machine Learning and Knowledge Discovery in Databases (ECML/PKDD) 2017: 404–408

## References

- Dror Weitz. “Counting independent sets up to the tree threshold”. In: Proc. of STOC'06.
- Nima Anari, Kuikui Liu, and Shayan Oveis Gharan. "Spectral independence in high-dimensional expanders and applications to the hardcore model.". In: Proc. of FOCS'20
- Allan Sly. “Computational transition at the uniqueness threshold”. In: Proc. of FOCS'10
- Andreas Galanis, Qi Ge, Daniel Stefankovic, Eric Vigoda, and Linji Yang. “Improved inapproximability results for counting independent sets in the hard-core model”. In: Random Structures and Algorithms 45.1 (2014)
- Tobias Friedrich, Andreas Göbel, Martin Krejca, and Marcus Pappik. "A spectral independence view on hard spheres via block dynamics". In: SIAM Journal on Discrete Mathematics 36.3 (2022)
- Tobias Friedrich, Andreas Göbel, Maximilian Katzmann, Martin Krejca, and Marcus Pappik. "Algorithms for hard-constraint point processes via discretization". In: Proc. of COCOON'22
- Tobias Friedrich, Andreas Göbel, Maximilian Katzmann, Martin Krejca, and Marcus Pappik. "Using random graphs to sample repulsive Gibbs point processes with arbitrary-range potentials". currently under review
- Marcus Michelen, Will Perkins. "Strong spatial mixing for repulsive point processes". In: Journal of Statistical Physics (to appear)
- Matthew Jenssen, Marcus Michelen, Mohan Ravichandran. "Quasipolynomial-time algorithms for repulsive Gibbs point processes".