6 No Going Back: The Fitness Level Method (FLM)

Some processes $(X_t)_{t \in \natnum}$ are monotone, that is, we have $\forall t: X_t \leq X_{t+1}$. Monotone processes occur frequently in the analysis of heuristic optimization, since a typical process considered is the best fitness found so far. For some such processes, simpler (and sometimes stronger) analyses are possible than with drift theorems allowing for non-monotone processes (see Proposition 6.2 [Application to Rumor Spreading] for an example).
Wegener [Weg01Theoretical Aspects of Evolutionary Algorithms. Ingo Wegener. ICALP 2001.] proposed the following method, called the fitness level method (FLM). We partition the search space into a number $m$ of sections (“levels”) in a linear fashion, so that all elements of later levels have better fitness than all elements of earlier levels. For the algorithm to be analyzed, we regard the best-so-far individual and the level it is in. Since the best-so-far individual can never move to lower levels, it will visit each level at most once (possibly staying there for some time). Suppose we can show that, for any level $i \lt m$ which the algorithm is currently in, the probability to leave this level is at least $p_i$. Then, bounding the expected waiting time for leaving a level $i$ by $1/p_i$ (geometric distribution) and pessimistically assuming that we visit (and thus have to leave) each level $i \lt m$ before reaching the target level $m$, we can derive an upper bound for the optimization time of $$ \sum_{i=1}^{m-1} \frac{1}{p_i}. $$ The fitness level method allows for simple and intuitive proofs and has therefore frequently been applied. Variations of it come with tail bounds [Wit14Fitness levels with tail bounds for the analysis of randomized search heuristics. Carsten Witt. Inf. Process. Lett. 2014.], work for parallel EAs [LS14General Upper Bounds on the Runtime of Parallel Evolutionary Algorithms. Jörg Lässig, Dirk Sudholt. Evol. Comput. 2014.] or regard populations [Wit06Runtime Analysis of the (mu + 1) EA on Simple Pseudo-Boolean Functions. Carsten Witt. Evol. Comput. 2006.]. A similar analysis in levels can be made for non-elitist EAs, but here it is crucially possible (and sometimes not unlikely) to lose a level. See Theorem 5.22 [Level-Based Theorem] for a corresponding theorem along with a discussion.
We state the fitness level method (FLM) formally as follows.
Theorem 6.1 [Fitness Level Method (FLM)]. Let $(X_t)_{t \in \natnum}$ be a monotone process on $[m]$. For all $i \in [m-1]$, let $p_i$ be a lower bound on the probability of a state change of $(X_t)_{t \in \natnum}$, conditional on being in state $i$, formally: for all $t$ with $\Pr{X_t=i} \gt 0$, $$ \Pr{X_{t+1} \gt i \mid X_0,…, X_t, X_t=i} \geq p_i. $$ Let $T$ be the random variable describing the first time $t$ such that $X_t = m$. Then $$ \Ew{T} \leq \sum_{i=1}^{m-1} \frac{1}{p_i}. $$
Proof. For all $i \in [m-1]$, we let $S_i = \set{t \in \natnum}{X_t = i}$. Since $(X_t)_{t \in \natnum}$ is monotone, each $S_i$ is a discrete interval. Calling the leaving of $S_i$ a success event, we can use Theorem 2.9 [Geometric Distribution] to see that the expected size of each $S_i$ is at most $1/p_i$. Since $T = \sum_{i=1}^{m-1}|S_i|$, the theorem follows.
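As a quick sanity check of this theorem, the following small Python sketch (ours, not part of the original analysis; all function names are hypothetical) simulates a monotone process that leaves each level $i$ with probability exactly $p_i$. Since such a process skips no level, the FLM bound $\sum_{i=1}^{m-1} 1/p_i$ is attained with equality.

```python
import random

def flm_upper_bound(ps):
    """FLM bound of Theorem 6.1: the sum of 1/p_i over all non-target levels."""
    return sum(1.0 / p for p in ps)

def simulate_monotone(ps, rng):
    """Monotone process on [m] starting in state 1; from level i it advances
    to level i+1 with probability exactly ps[i-1]. Returns the hitting time
    of the target level m = len(ps) + 1."""
    level, steps, m = 1, 0, len(ps) + 1
    while level < m:
        steps += 1
        if rng.random() < ps[level - 1]:
            level += 1
    return steps

ps = [0.5] * 5                 # leave each of the 5 non-target levels w.p. 1/2
bound = flm_upper_bound(ps)    # = 10; exact here, since no level is skipped
rng = random.Random(42)
runs = 20000
mean = sum(simulate_monotone(ps, rng) for _ in range(runs)) / runs
print(bound, round(mean, 2))
```

For processes that can skip levels, the simulated mean would fall below the bound; here the two values agree up to sampling noise.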
Note. The main strength of the fitness level method over drift theorems is that the chance to leave a level $i$, $p_i$, can depend arbitrarily on $i$. In contrast, in Theorem 5.13 [Variable Drift], one of the most general drift theorems, the drift has to depend monotonically on the state of the process.
Conversely, the main strength of drift theorems over the fitness level method is that the process is allowed to be non-monotone, so that we can work with other processes than those based on fitness.
The following example shows a toy application of the fitness level method where drift theorems are not easily applicable. The setting is borrowed from the area of rumor spreading, see, for example, [DK14Tight Analysis of Randomized Rumor Spreading in Complete Graphs. Benjamin Doerr, Marvin Künnemann. ANALCO 2014.] and also [DFF+19Island Models Meet Rumor Spreading. Benjamin Doerr, Philipp Fischbeck, Clemens Frahnow, Tobias Friedrich, Timo Kötzing, Martin Schirneck. Algorithmica 2019.] for an application in the area of randomized search heuristics.
Proposition 6.2 [Application to Rumor Spreading]. Let $n \in \natnum$. Suppose $n$ people each want to obtain a certain information, and suppose in iteration $0$ exactly one of them knows this information. In each iteration, one of the $n$ people is chosen uniformly at random and if this person does not know the information, the person will contact another person chosen uniformly at random. If this other person knows the information, then the calling person from now on also knows the information. Then it takes, in expectation, at most $2n(\ln(n-1)+1)$ iterations until all persons know the rumor.
Proof. We let, for each $t \in \natnum$, $X_t$ be the number of persons who know the rumor after $t$ iterations. Then $(X_t)_{t \in \natnum}$ is a monotone process on $[n]$. Let some iteration $t$ be given. If, after $t$ iterations, exactly $i \in [n-1]$ persons know the rumor, then the probability that in the next iteration an uninformed person is chosen to make a call is $(n-i)/n$. The probability that this person calls an informed person is $i/(n-1)$, independently of the first event. Thus, the chance $p_i$ to “leave state $i$” is $$ p_i = \frac{n-i}{n} \; \frac{i}{n-1}. $$ By Theorem 6.1 [Fitness Level Method (FLM)], the total time until all people are informed is thus at most $$ \sum_{i=1}^{n-1} \frac{1}{p_i} = \sum_{i=1}^{n-1} \frac{n(n-1)}{(n-i)i}. $$ [Comment: For didactic reasons, we give two different ways of bounding this sum; the second uses calculus and gives the better bound.] A first way to bound the sum is by splitting it into two and using worst case estimates to simplify. Let $k = \lfloor n/2 \rfloor$; we have \begin{align*} \sum_{i=1}^{n-1} \frac{n(n-1)}{(n-i)i} & = \sum_{i=1}^{k} \frac{n(n-1)}{(n-i)i} + \sum_{i=k+1}^{n-1} \frac{n(n-1)}{(n-i)i}\\ & \leq \sum_{i=1}^{k} \frac{n(n-1)}{(n-k)i} + \sum_{i=k+1}^{n-1} \frac{n(n-1)}{(n-i)k}\\ & = \sum_{i=1}^{k} \frac{n(n-1)}{(n-k)i} + \sum_{i=1}^{n-k-1} \frac{n(n-1)}{i \; k}\\ & = \frac{n(n-1)}{n-k} \sum_{i=1}^{k} \frac{1}{i} + \frac{n(n-1)}{k}\sum_{i=1}^{n-k-1} \frac{1}{i}\\ & \leq \frac{n(n-1)}{n-k} \left(\ln(k)+1\right) + \frac{n(n-1)}{k}\left(\ln(n-k-1)+1\right). \end{align*} The last inequality uses Lemma 9.11 [Upper Bound on the Harmonic Sum]. By bounding $1/(n-k) \leq 2/n$ and $1/k \leq 2/(n-1)$ we can further bound the term by $$ 2(n-1)\left(\ln(k)+1\right)+2n\left(\ln(n-k-1)+1\right) \leq 4n\left(\ln(n)+1\right). $$ We can get a tighter bound as follows by turning to calculus. 
The function $$ f\colon [1,n-1] \rightarrow \realnum, x \mapsto \frac{1}{(n-x)x} $$ has a minimum at $n/2$, is before that monotone decreasing and afterwards monotone increasing. Thus, we can use Lemma 9.12 [Upper Bounding Sum by Integral] twice, once on the interval $[1,\lfloor n/2 \rfloor]$ and once on $[\lceil n/2 \rceil,n-1]$ to bound $$ \sum_{i=1}^{n-1} \frac{n(n-1)}{(n-i)i} \leq n + n + n(n-1) \int_1^{n-1} f(x) \mathrm{d}x. $$ Note that the summands before the integral are the first and the last summand of the large sum (which are not covered by the cited lemma). The indefinite integral over $f$ is given by the function $$ x \mapsto \frac{\ln(x)-\ln(n-x)}{n}, $$ which can be seen by taking the derivative of that function. Using the integral bounds, we arrive at $$ \sum_{i=1}^{n-1} \frac{n(n-1)}{(n-i)i} \leq 2n + (n-1)\left(\ln(n-1)-\ln(1) - \ln(1)+\ln(n-1)\right) = 2(n-1)\ln(n-1) + 2n $$ as desired.
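The sum in this proof can also be evaluated in closed form: partial fractions give $n(n-1)/((n-i)i) = (n-1)(1/i + 1/(n-i))$, so the whole sum equals $2(n-1)H_{n-1}$, where $H_m$ denotes the $m$th harmonic number. The following Python sketch (ours, for illustration only) checks this identity and both bounds from the proof numerically.

```python
import math

def rumor_flm_sum(n):
    """FLM bound from Proposition 6.2: sum_{i=1}^{n-1} n(n-1) / ((n-i) i)."""
    return sum(n * (n - 1) / ((n - i) * i) for i in range(1, n))

def harmonic(m):
    """The m-th harmonic number H_m."""
    return sum(1.0 / i for i in range(1, m + 1))

n = 100
exact = rumor_flm_sum(n)
# Partial fractions: n(n-1)/((n-i)i) = (n-1)(1/i + 1/(n-i)),
# so the whole sum collapses to 2(n-1) H_{n-1}.
closed_form = 2 * (n - 1) * harmonic(n - 1)
calculus_bound = 2 * (n - 1) * math.log(n - 1) + 2 * n   # second bound in the proof
splitting_bound = 4 * n * (math.log(n) + 1)              # first, cruder bound
print(exact, closed_form, calculus_bound, splitting_bound)
```

For $n = 100$ the exact sum is roughly $1025$, the calculus bound roughly $1110$, and the splitting bound roughly $2242$, illustrating why the second derivation is preferable.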
Note that due to the drift being strongest in the middle of the state space, no other drift theorem is directly applicable without losing asymptotically: one could apply the additive drift theorem to obtain a bound of $O(n^2)$ by using that the drift is $\Omega(1/n)$. Another choice is to use an argument in phases: since the process is monotone, one can analyze the time until reaching $n/2$ separately from the remainder. This would then potentially allow using some version of multiplicative up-drift (for the first phase) and multiplicative drift (for the second phase). However, this would lead to an unnecessarily complicated analysis.
While very effective for proving upper bounds, it seems much harder to use fitness level arguments to prove lower bounds. The first to devise a lower bound method based on fitness levels that gives competitive bounds was Sudholt [Sud13A New Method for Lower Bounds on the Running Time of Evolutionary Algorithms. Dirk Sudholt. IEEE Trans. Evol. Comput. 2013.]. Next we see a lower bound from [DK21Lower bounds from fitness levels made easy. Benjamin Doerr, Timo Kötzing. GECCO 2021.].
Theorem 6.3 [Fitness Level Method with Visit Probabilities, Lower Bound]. Let $(X_t)_{t \in \natnum}$ be a monotone process on $[m]$. For all $i \in [m-1]$, let $p_i$ be an upper bound on the probability of a state change of $(X_t)_{t \in \natnum}$, conditional on being in state $i$. Furthermore, let $v_i$ be a lower bound on the probability of there being a $t$ such that $X_t = i$ (the visit probability of level $i$). Let $T$ be the random variable describing the first time $t$ such that $X_t = m$. Then $$ \Ew{T} \geq \sum_{i=1}^{m-1} \frac{v_i}{p_i}. $$
Proof. We proceed as in the proof for Theorem 6.1 [Fitness Level Method (FLM)]. For all $i \in [m-1]$, we let $S_i = \set{t \in \natnum}{X_t = i}$. With probability at most $1-v_i$ we have that $S_i = \emptyset$. Again using Theorem 2.9 [Geometric Distribution], we see that the expected size of each non-empty $S_i$ is at least $1/p_i$. Since $T = \sum_{i=1}^{m-1}|S_i|$, the theorem follows with linearity of expectation.
A corresponding upper bound [DK21Lower bounds from fitness levels made easy. Benjamin Doerr, Timo Kötzing. GECCO 2021.] follows with analogous arguments and shows the tightness of the approach, with the bounds required on $p_i$ and $v_i$ reversed.
Theorem 6.4 [Fitness Level Method with Visit Probabilities, Upper Bound]. Let $(X_t)_{t \in \natnum}$ be a monotone process on $[m]$. For all $i \in [m-1]$, let $p_i$ be a lower bound on the probability of a state change of $(X_t)_{t \in \natnum}$, conditional on being in state $i$. Furthermore, let $v_i$ be an upper bound on the probability of there being a $t$ such that $X_t = i$. Let $T$ be the random variable describing the first time $t$ such that $X_t = m$. Then $$ \Ew{T} \leq \sum_{i=1}^{m-1} \frac{v_i}{p_i}. $$
Proof. Analogous to the proof of Theorem 6.3 [Fitness Level Method with Visit Probabilities, Lower Bound].
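Theorems 6.3 and 6.4 are tight in the following sense: when the leaving and visit probabilities are known exactly, the two bounds coincide and give the exact expected hitting time. As a toy illustration (our own construction, not from the original text), consider a monotone chain on $[m]$ that always leaves its current level (so $p_i = 1$) and jumps up by $1$ or $2$ with equal probability, except that from level $m-1$ it always moves to $m$. The visit probabilities obey $v_1 = 1$ and $v_i = (v_{i-1} + v_{i-2})/2$ with $v_0 = 0$, and $\sum_{i=1}^{m-1} v_i/p_i$ is then the exact expected hitting time. A Python sketch:

```python
import random

def visit_probabilities(m):
    """v_i = Pr[level i is ever visited] for the jump-1-or-2 chain on [m],
    via the recursion v_1 = 1, v_i = (v_{i-1} + v_{i-2}) / 2 (with v_0 = 0)."""
    v = [0.0] * m          # v[i] holds v_i for i = 1..m-1; v[0] is the sentinel v_0 = 0
    v[1] = 1.0
    for i in range(2, m):
        v[i] = (v[i - 1] + v[i - 2]) / 2
    return v

def simulate(m, rng):
    """Run the chain from level 1 until level m; return the hitting time."""
    level, steps = 1, 0
    while level < m:
        steps += 1
        level += 1 if (level == m - 1 or rng.random() < 0.5) else 2
    return steps

m = 10
v = visit_probabilities(m)
# Every level is left in one step (p_i = 1), so both Theorem 6.3 and
# Theorem 6.4 give sum_{i=1}^{m-1} v_i / p_i = sum of the v_i.
predicted = sum(v[1:m])
rng = random.Random(7)
runs = 40000
empirical = sum(simulate(m, rng) for _ in range(runs)) / runs
print(round(predicted, 4), round(empirical, 4))
```

The simulated hitting time matches the prediction up to sampling noise, confirming that the lower and upper bound meet for exact $p_i$ and $v_i$.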
In a typical application of the fitness level method, finding good estimates for the leaving probabilities is easy. It is more complicated to estimate the visit probabilities accurately; the following lemma from [DK21Lower bounds from fitness levels made easy. Benjamin Doerr, Timo Kötzing. GECCO 2021.] offers an option.
Lemma 6.5 [Computing Visit Probabilities]. Let $(X_t)_{t \in \natnum}$ be a monotone process on $[m]$. Further, suppose that $(X_t)_{t \in \natnum}$ reaches state $m$ after a finite time with probability $1$. Let $i \lt m$ be given. Suppose there is $v_i$ such that, for all $t \in \natnum$ with $\Pr{X_{t+1} \geq i \gt X_t} \gt 0$, $$ \Pr{X_{t+1} = i \mid X_0,…, X_t; X_{t+1} \geq i \gt X_t} \geq v_i, $$ and $$ \Pr{X_0 = i \mid X_0 \geq i} \geq v_i. $$ Then $v_i$ is a lower bound on the probability of visiting level $i$ as required by Theorem 6.3 [Fitness Level Method with Visit Probabilities, Lower Bound].
An analogous bound for upper bounds on visit probabilities also holds.

6.1 Applications

As a first application of these methods we now determine a lower bound for the coupon collector problem (see Theorem 2.7 [Coupon Collector with Multiplicative Drift]).
Theorem 6.6 [Coupon Collector, Lower Bound via Fitness Levels]. Suppose we want to collect at least one of each kind of $n \in \natnum_{\geq 1}$ coupons. Each round, we are given one coupon chosen uniformly at random from the $n$ kinds. Then, in expectation, we have to collect for at least $n \ln n$ iterations (in fact, for exactly $nH_n$ iterations, where $H_n$ denotes the $n$th harmonic number).
Proof. Let $X_t$ be the number of distinct coupon kinds collected after $t$ iterations. Note that this process is monotone and, since it gains at most one in any iteration, visits all elements of $[n-1]$ before reaching the target of $n$; hence $v_i = 1$ for all $i$. Given that we currently have $i$ coupons, the probability of making progress (of $1$) in iteration $t+1$ is $p_i = (n-i)/n$, since $n-i$ coupon kinds are missing, each arrives with probability $1/n$, and all these events are disjoint. An application of both Theorem 6.3 [Fitness Level Method with Visit Probabilities, Lower Bound] and Theorem 6.4 [Fitness Level Method with Visit Probabilities, Upper Bound] gives an exact value of the expected time to find all $n$ coupons of $$ \sum_{i=0}^{n-1} \frac{v_i}{p_i} = \sum_{i=0}^{n-1} \frac{n}{n-i} = nH_{n}, $$ where, for each $m \in \natnum_+$, we use $H_m$ to denote the $m$th harmonic number.
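The value $nH_n$ is easy to confirm numerically; the following Python sketch (ours, purely illustrative) also compares it with a direct simulation of the coupon collector.

```python
import random

def coupon_flm_value(n):
    """Exact FLM value from Theorem 6.6: sum_{i=0}^{n-1} n / (n - i), with v_i = 1."""
    return sum(n / (n - i) for i in range(n))

def harmonic(m):
    """The m-th harmonic number H_m."""
    return sum(1.0 / i for i in range(1, m + 1))

def collect(n, rng):
    """Draw uniform coupons until all n kinds are seen; return the number of draws."""
    seen, draws = set(), 0
    while len(seen) < n:
        draws += 1
        seen.add(rng.randrange(n))
    return draws

n = 50
exact = coupon_flm_value(n)    # equals n * H_n
rng = random.Random(1)
runs = 5000
empirical = sum(collect(n, rng) for _ in range(runs)) / runs
print(round(exact, 2), round(empirical, 2))
```

For $n = 50$ the exact value is about $225$, and the simulated mean agrees up to sampling noise.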
As a second application we now determine the precise run time of the $(1+1)$ EA on LeadingOnes via the two fitness level theorems. This optimization time was first established in [BDN10Optimal Fixed and Adaptive Mutation Rates for the LeadingOnes Problem. Süntje Böttcher, Benjamin Doerr, Frank Neumann. PPSN (1) 2010.].
Theorem 6.7 [Run Time of $(1+1)$ EA on LeadingOnes]. Consider the $(1+1)$ EA optimizing LeadingOnes with mutation rate $p$. Let $T$ be the (random) time for the $(1+1)$ EA to find the optimum. Then $$ \Ew{T} = \frac{1}{2} \sum_{i=0}^{n-1} \frac{1}{(1-p)^i p}. $$
Proof. We want to apply Theorem 6.4 [Fitness Level Method with Visit Probabilities, Upper Bound] and Theorem 6.3 [Fitness Level Method with Visit Probabilities, Lower Bound] simultaneously. For all $t \in \natnum$, we let $X_t$ be the LeadingOnes-value of the individual which the $(1+1)$ EA has found after $t$ iterations. Now we need a precise result for the probability to leave a level and for the probability to visit a level. First, we consider the probability $p_i$ to leave a given level $i \lt n$. Suppose the algorithm has a current search point in level $i$, so it has $i$ leading $1$s and then a $0$. The algorithm leaves level $i$ now if and only if it flips the first $0$ of the bit string (probability $p$) and none of the previous bits (probability $(1-p)^i$). Hence, $p_i = p(1-p)^i$. Next we consider the probability $v_i$ to visit a level $i$. We claim that it is exactly $1/2$, following reasoning given in several places before [DJW02On the analysis of the (1+1) evolutionary algorithm. Stefan Droste, Thomas Jansen, Ingo Wegener. Theor. Comput. Sci. 2002., Sud13A New Method for Lower Bounds on the Running Time of Evolutionary Algorithms. Dirk Sudholt. IEEE Trans. Evol. Comput. 2013.]. We want to use Lemma 6.5 [Computing Visit Probabilities] and its analogue for upper bounds. Let $i$ be given. For the initial search point, if it is at least on level $i$ (the condition considered by the lemma), the individual is on level $i$ if and only if the $(i+1)$st bit is a $0$, so exactly with probability $1/2$ as desired for both bounds. Before an individual with at least $i$ leading $1$s is created, the bit at position $i+1$ remains uniformly random (this can be seen by induction: it is uniform at the beginning and does not experience any bias in any iteration while no individual with at least $i$ leading $1$s is created). Once such an individual is created, if the bit at position $i+1$ is $1$, the level $i$ is skipped, otherwise it is visited. 
Thus, the algorithm skips level $i$ with probability exactly $1/2$, giving $v_i = 1/2$. With these exact values for the $p_i$ and $v_i$, Theorem 6.4 [Fitness Level Method with Visit Probabilities, Upper Bound] and Theorem 6.3 [Fitness Level Method with Visit Probabilities, Lower Bound] immediately yield the claim.
By computing the geometric series in Theorem 6.7 [Run Time of $(1+1)$ EA on LeadingOnes], we obtain as a (well-known) corollary that the $(1+1)$ EA with the classic mutation rate $p = 1/n$ optimizes LeadingOnes in an expected run time of $n^2\frac{e-1}{2}(1\pm o(1))$.
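As a sanity check of Theorem 6.7 and this corollary, one can evaluate the exact formula for $p = 1/n$ and compare it both with the asymptotic value $\frac{e-1}{2}n^2$ and with a direct simulation of the $(1+1)$ EA. A Python sketch (the implementation details are ours, assuming the standard $(1+1)$ EA that accepts offspring of at least equal fitness):

```python
import math
import random

def leadingones_value(x):
    """Number of leading 1-bits of the bit list x."""
    k = 0
    for bit in x:
        if bit != 1:
            break
        k += 1
    return k

def exact_runtime(n, p):
    """Theorem 6.7: E[T] = (1/2) * sum_{i=0}^{n-1} 1 / ((1-p)^i * p)."""
    return 0.5 * sum(1.0 / ((1 - p) ** i * p) for i in range(n))

def run_one_plus_one_ea(n, p, rng):
    """(1+1) EA with standard bit mutation (rate p), accepting offspring of
    at least equal fitness; returns iterations until the all-ones string."""
    x = [rng.randrange(2) for _ in range(n)]
    fx, t = leadingones_value(x), 0
    while fx < n:
        t += 1
        y = [1 - b if rng.random() < p else b for b in x]
        fy = leadingones_value(y)
        if fy >= fx:
            x, fx = y, fy
    return t

n = 15
p = 1.0 / n
predicted = exact_runtime(n, p)
rng = random.Random(3)
runs = 2000
empirical = sum(run_one_plus_one_ea(n, p, rng) for _ in range(runs)) / runs
print(round(predicted, 1), round(empirical, 1), round((math.e - 1) / 2 * n * n, 1))
```

Already for $n = 15$ the exact formula, the simulation, and the asymptotic $\frac{e-1}{2}n^2$ are within a few percent of each other.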

Creative Commons License Agreement
This work is licensed as Creative Commons Attribution - ShareAlike 4.0 International.
All mathematical statements in this script are freely usable, of course. As a collection and with comments, they are licensed as stated above.
