7 A Different Perspective: Fixed Budget Optimization
In the previous chapters we have seen many theorems regarding the
first hitting time of a process. This answers the question: “How much time do I have to invest until I reach a desired outcome?” Sometimes we want to answer a different question: “I have a
fixed budget $t_0$ of time available; what performance can I expect?”
Furthermore, fixed-budget results that hold with high probability are crucial for the analysis of algorithm configurators [
HOS19On the impact of the cutoff time on the performance of algorithm configurators. George T. Hall, Pietro S. Oliveto, Dirk Sudholt. GECCO 2019.]. These configurators test different algorithms for fixed budgets in order to make statements about their appropriateness in a given setting.
In this chapter we want to discuss general tools for fixed-budget analyses. We still want to use knowledge about step-wise changes and translate it into the global view, just as with the drift theorems for first hitting times.
We start by analyzing the most basic setting in
Section 7.1 [The Additive Case] and generalize it in
Section 7.2 [Variable Fixed Budget Drift]. We show sample results derived with these methods in
Section 7.3 [Applications to OneMax and LeadingOnes].
7.1 The Additive Case
We start with the simple case of additive drift. If we expect to go down by $\delta$ in each iteration, then, after $t$ iterations, we expect to be down by $t \delta$, as would be the case for a completely deterministic process. Here the proof is simple and instructive.
Theorem 7.1 [Additive Fixed-Budget Drift].
Let $(X_t)_{t \in \natnum}$ be an
integrable random process on $\realnum$. Suppose there is a $\delta \in \realnum_{+}$ so that we have the drift condition
- (D)
- $\Ew{X_t - X_{t+1} \mid X_0,…, X_t} \geq \delta$.
Equivalently, the drift condition can be stated as
- (D')
- $\Ew{X_{t+1} \mid X_0,…, X_t} \leq X_t - \delta$.
Then, for all $t \geq 0$,
$$
\Ew{X_t \mid X_0} \leq X_0 - t \delta.
$$
Proof.
We prove the theorem by induction on $t$, with a trivial induction basis. Suppose now the statement is true for some $t \geq 0$ (IH). Using the
law of total expectation (LTE), we have
\begin{align*}
\Ew{X_{t+1} \mid X_0}
& \eqnComment{(LTE)}{=} \Ew{ \Ew{X_{t+1} \mid X_0,…, X_t} \mid X_0}\\
& \eqnComment{(D')}{\leq} \Ew{ X_t - \delta \mid X_0}\\
& \eqnComment{(IH)}{\leq} X_0 - (t+1) \delta.
\end{align*}
This concludes the induction.
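As a quick illustration (not part of the formal development), the bound can be checked with a small Monte Carlo sketch; the step distribution and all parameters below are arbitrary choices. A process losing a uniform amount from $[0, 2\delta]$ per step has drift exactly $\delta$, so its mean after $t$ steps should match $X_0 - t \delta$ up to sampling noise.

```python
import random

def simulate_additive(x0, delta, t, runs=10000, seed=1):
    """Estimate E[X_t] for a process that, in each step, decreases by a
    uniform amount from [0, 2*delta]; the drift is then exactly delta."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        x = x0
        for _ in range(t):
            x -= rng.uniform(0, 2 * delta)
        total += x
    return total / runs

x0, delta, t = 100.0, 0.5, 50
est = simulate_additive(x0, delta, t)
print(est, x0 - t * delta)  # estimate vs. the bound X_0 - t*delta
```

Since the drift is exactly $\delta$ for this toy process, Theorem 7.1 holds with equality here.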
Note that this version does not take into account that the drift condition might only hold before a target has been reached. We refer to this setting as
unlimited time. The next theorem considers a potential end point; its proof follows the proof of Theorem 1 in [
Len20Drift Analysis. Johannes Lengler. In: Theory of Evolutionary Computation 2020.].
Theorem 7.2 [Additive Fixed-Budget Drift, Limited Time].
Let $(X_t)_{t \in \natnum}$ be an
integrable random process on $\realnum$ and let $T$ be any random variable on $\natnum$. Suppose that there is a $\delta \in \realnum_{+}$ so that we have the following drift condition.
- (D)
- For all $t \in \natnum$ with $\Pr{t \lt T} \gt 0$, $\Ew{X_t - X_{t+1} \mid X_0, t \lt T} \geq \delta$.
- (D')
- For all $t \in \natnum$ with $\Pr{t \geq T} \gt 0$, $\Ew{X_t - X_{t+1} \mid X_0, t \geq T} \geq 0$.
Then, for all $t \in \natnum$,
$$
\Ew{X_t \mid X_0} \leq X_0 - t\delta \Pr{t \leq T}.
$$
Proof.
First, we show that, for all $t \in \natnum$,
\begin{equation}
\Ew{ X_{t+1} - X_t \mid X_0} \leq -\delta\Pr{t \lt T}. \tag{$\ast$}
\end{equation}
We distinguish three cases. (1) If $\Pr{t \lt T} = 0$, then Equation ($\ast$) follows from
(D'). (2) If $\Pr{t \lt T} = 1$, then Equation ($\ast$) follows from
(D). (3) Otherwise, we use the
law of total expectation to get
\begin{align*}
\Ew{ X_{t+1} - X_t \mid X_0} & = \Ew{ X_{t+1} - X_t \mid X_0, t \lt T}\Pr{t \lt T} + \Ew{ X_{t+1} - X_t \mid X_0, t \geq T}\Pr{t \geq T}\\
& \eqnComment{(D)}{\leq} -\delta \Pr{t \lt T} + \Ew{ X_{t+1} - X_t \mid X_0, t \geq T}\Pr{t \geq T}\\
& \eqnComment{(D')}{\leq} -\delta \Pr{t \lt T}.
\end{align*}
We now prove the theorem by induction on $t \in \natnum$, with a trivial induction basis for $t=0$. Suppose now the statement is true for some $t \geq 0$ (IH). Then we have
\begin{align*}
\Ew{X_{t+1} \mid X_0}
& = \Ew{ X_{t+1} - X_t + X_t \mid X_0}\\
& = \Ew{ X_{t+1} - X_t \mid X_0} + \Ew{ X_t \mid X_0}\\
& \eqnComment{(IH)}{\leq} \Ew{ X_{t+1} - X_t \mid X_0} + X_0 - t\delta \Pr{t \leq T}\\
& \eqnComment{($\ast$)}{\leq} -\delta\Pr{t \lt T} + X_0 - t\delta \Pr{t \leq T}\\
& \leq -\delta\Pr{t+1 \leq T} + X_0 - t\delta \Pr{t+1 \leq T}\\
& = X_0 - (t+1)\delta \Pr{t+1 \leq T}.
\end{align*}
This concludes the induction.
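The role of the factor $\Pr{t \leq T}$ can also be seen in simulation. The following sketch uses an arbitrarily chosen toy process: it starts at $10$ and decreases by $2$ with probability $1/2$ in each step (so $\delta = 1$), with $0$ absorbing, and estimates both $\Ew{X_t}$ and $\Pr{t \leq T}$.

```python
import random

def estimate(x0, t, runs=20000, seed=2):
    """Estimate E[X_t] and Pr(t <= T) for a process that, from x > 0,
    steps down by 2 with probability 1/2 (drift 1); state 0 is absorbing."""
    rng = random.Random(seed)
    total = 0.0
    survived = 0  # trajectories with X_s > 0 for all s < t, i.e. t <= T
    for _ in range(runs):
        x = x0
        alive = True
        for _ in range(t):
            if x == 0:
                alive = False
            elif rng.random() < 0.5:
                x -= 2
        total += x
        survived += alive
    return total / runs, survived / runs

x0, t, delta = 10, 8, 1
est, p = estimate(x0, t)
print(est, x0 - t * delta * p)  # Theorem 7.2: E[X_t] <= X_0 - t*delta*Pr(t <= T)
```

For these parameters the bound is not tight (the process cannot go below $0$, so drift is lost near the end), but the inequality is clearly visible.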
7.2 Variable Fixed Budget Drift
For the rest of this chapter, we want to generalize
Theorem 7.1 [Additive Fixed-Budget Drift] to
state-varying drift. Suppose that, for some function $h$, in state $x$ we observe a drift of $h(x)$. In order to understand what kind of result to expect in this context, we consider a completely deterministic process starting in $x_0 \in \realnum$ and progressing down by $h(x)$ when in state $x$. Then, after one step, the process is in $x_0 - h(x_0)$, after two steps in $x_0 - h(x_0) - h(x_0-h(x_0))$, and so on. We write, for all $x$, $\tilde{h}(x) = x - h(x)$. Thus we can write the sequence of states of the process as $x_0, \tilde{h}(x_0), \tilde{h}(\tilde{h}(x_0))$ and so on. We write $\tilde{h}^t$ for the $t$-fold application of $\tilde{h}$, so after $t$ steps of the process the state is $\tilde{h}^t(x_0)$. Thus, we want a theorem that shows that we get a similar expected value for a probabilistic process.
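This deterministic reference process is straightforward to compute. As a small sketch, for the (arbitrarily chosen) drift function $h(x) = x/10$ the iterates have the closed form $\tilde{h}^t(x_0) = 0.9^t x_0$:

```python
def h_tilde_power(h, x0, t):
    """Compute the t-fold application of h_tilde: x -> x - h(x)."""
    x = x0
    for _ in range(t):
        x -= h(x)
    return x

# for h(x) = x/10 the iterates are exactly x0 * 0.9**t
x = h_tilde_power(lambda v: v / 10, 100.0, 20)
print(x)
```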
The main question is now what we need to assume about $h$ in order to get a behavior similar to that of the deterministic process. Consider the following monotone process on $\{0,1,2\}$: $X_0$ is $2$, and from State $2$ the process moves to one of $\{0,1\}$ uniformly. State $0$ is the target state; from State $1$ there is only a very small probability (say $0.1$) of progressing to $0$, and otherwise the process stays in State $1$. Then it is better to be in State $2$ than to be trapped in State $1$. Here the drift is $1.5$ in State $2$ and only $0.1$ in State $1$. Thus, the
expected next state from State $2$ is $0.5$, which is less than the expected next state from State $1$, which is $0.9$! Intuitively, greedily going forward is a bad idea here: given the choice between States $1$ and $2$, one should choose (non-greedily) State $2$. It turns out that forbidding this kind of situation, formalized in the next definition, leads to a viable generalization of
Theorem 7.1 [Additive Fixed-Budget Drift].
Definition 7.3 [Greed-Admitting Functions].
We say that a drift function $h\colon S \to \realnum_{\gt 0}$ is greed-admitting if and only if $\mathrm{id} - h$ (the function $x \mapsto x - h(x)$) is monotone non-decreasing.
Intuitively, this formalizes the idea that being closer to the goal is always better (“greed is good”). The process described before the definition is, in a sense, badly designed: State $1$ is worse than State $2$, so it should not have a smaller value.
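For the example process above, the map sending a state to its expected next state (that is, $x \mapsto x - h(x)$ for the drift function $h$ of the process) can be tabulated directly; the following sketch confirms that it fails to be monotone non-decreasing, so that $h$ is not greed-admitting.

```python
# expected next state in the example process: from State 2 move uniformly
# to {0, 1}; from State 1 move to 0 with probability 0.1, else stay;
# State 0 is the target and is left in place
expected_next = {0: 0.0, 1: 0.9 * 1 + 0.1 * 0, 2: 0.5 * 0 + 0.5 * 1}

# greed-admitting would require x -> x - h(x) to be monotone non-decreasing,
# but the expected next state drops from 0.9 (State 1) to 0.5 (State 2)
print(expected_next[1], expected_next[2])
```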
We now give two different versions of fixed-budget drift theorems. The first considers unlimited time, a very strong requirement, leading to a strong conclusion.
Theorem 7.4 [Variable Fixed-Budget Drift, Unlimited Time].
Let $(X_t)_{t \in \natnum}$ be an
integrable random process on $S \subseteq \realnum$, where $0 = \min S$. Let $h\colon S \to \realnum_{\geq 0}$ be a
twice differentiable, convex and greed-admitting function and define $\tilde{h}(x) = x - h(x)$. Suppose that $\tilde{h}'(0) \in \; ]0,1]$ and that we have the drift condition
- (D-ut)
- $\Ew{X_t - X_{t+1} \mid X_0,…, X_t} \geq h(X_t)$.
Equivalently, the drift condition can be stated as
- (D-ut')
- $\Ew{X_{t+1} \mid X_0,…, X_t} \leq \tilde{h}(X_t)$.
Then, for all $t \geq 0$,
$$
\Ew{X_t \mid X_0} \leq \tilde{h}^t(X_0)
$$
and, in particular,
$$
\Ew{X_t} \leq \tilde{h}^t(\Ew{X_0}).
$$
Crucial for this theorem is that the drift condition is
unlimited time, by which we mean that the drift condition has to hold for all times $t$, not just, as is typical for drift theorems in the literature, for those before the optimum is hit. This theorem is applicable if there is no optimum (and the optimization progresses indefinitely) or if the drift is $0$ in the optimum. In order to bypass these limitations, we also give a variant which allows for
limited-time drift, where the drift condition only needs to hold before the optimum is hit; however, in this case we pick up an additional error term in the result, derived from the possibility of hitting the optimum within the allowed time budget of $t$. Thus, in order to apply this theorem, one will typically need concentration bounds for the time to hit the optimum.
A special case of the previous theorem is given in [
LS15Fixed Budget Performance of the (1+1) EA on Linear Functions. Johannes Lengler, Nicholas Spooner. FOGA 2015.], where the drift is necessarily multiplicative. Note that in this case we can typically consider unlimited time, since after reaching the state $0$ the multiplicative drift holds vacuously.
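To illustrate Theorem 7.4 in this multiplicative setting, the following sketch simulates a process with drift $h(x) = \delta x$, realized (as one arbitrary choice) by multiplying the state by a uniform factor with mean $1-\delta$ in each step, and compares the estimate of $\Ew{X_t}$ with $\tilde{h}^t(X_0) = (1-\delta)^t X_0$.

```python
import random

def simulate_mult(x0, delta, t, runs=20000, seed=3):
    """Estimate E[X_t] for a process with drift h(x) = delta*x: each step
    multiplies the state by a uniform factor from [1 - 2*delta, 1]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        x = x0
        for _ in range(t):
            x *= rng.uniform(1 - 2 * delta, 1)
        total += x
    return total / runs

x0, delta, t = 100.0, 0.1, 30
est = simulate_mult(x0, delta, t)
print(est, x0 * (1 - delta) ** t)  # estimate vs. h_tilde^t(X_0)
```

Here $h$ is linear, hence convex and greed-admitting with $\tilde{h}'(0) = 1 - \delta \in \; ]0,1]$, and the drift condition holds with equality in expectation, so estimate and bound agree up to sampling noise.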
Now we give a version of the variable fixed-budget drift where the time is limited in the sense that the drift condition might no longer hold at some point in time.
Theorem 7.5 [Variable Fixed-Budget Drift, Limited Time].
Let $(X_t)_{t \in \natnum}$ be an
integrable random process on $S \subseteq \realnum$, where $0 = \min S$. Let $T = \inf\set{ t \in \natnum}{ X_t = 0 }$, let $h\colon S \to \realnum_{\geq 0}$ be a
twice differentiable, convex and greed-admitting function, and define $\tilde{h}(x) = x - h(x)$. Suppose that $\tilde{h}'(0) \in \; ]0,1]$ and that we have, for all $t \lt T$, the drift condition
- (D-lt)
- $\Ew{X_t - X_{t+1} \mid X_0,…, X_t} \geq h(X_t)$.
Equivalently, the drift condition can be stated as
- (D-lt')
- $\Ew{X_{t+1} \mid X_0,…, X_t} \leq \tilde{h}(X_t)$.
Then, for all $t \geq 0$,
$$
\Ew{X_t \mid X_0} \leq \tilde{h}^t(X_0) - \frac{\tilde{h}(0)}{\tilde{h}'(0)} \cdot \Pr{t \geq T \mid X_0}
$$
and, in particular,
$$
\Ew{X_t} \leq \tilde{h}^t(\Ew{X_0}) - \frac{\tilde{h}(0)}{\tilde{h}'(0)} \cdot \Pr{t \geq T}.
$$
For both of these theorems, the drift function bounding the drift has to be convex and greed-admitting. Intuitively, greed-admitting says that being closer to the goal is always better in terms of the expected state after an additional iteration, while convexity typically means that search points closer to the goal have weaker drift. These conditions are fulfilled in many sample applications.
In order to interpret the conclusions of the last two theorems properly, we need to estimate the term $\tilde{h}^t$. With the following theorem we give a general way of making this estimation.
Theorem 7.6 [Estimation of Iterated Functions].
Let $h\colon \realnum \rightarrow \realnum_{+}$ be a monotone non-decreasing and
integrable function. Let $\tilde{h} = \mathrm{id} - h$. Then, for all starting points $y$, all target points $x \leq y$ and all time budgets $t$,
$$
\mbox{if } t \geq \int_{x}^{y} \frac{1}{h(z)}\mathrm{d}z \;\;\mbox{ then }\;\; \tilde{h}^t(y) \leq x.
$$
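The following sketch checks Theorem 7.6 for the (arbitrarily chosen) drift function $h(z) = z/10$: the integral evaluates to $10\ln(y/x)$, and iterating $\tilde{h}$ for that many steps indeed reaches the target.

```python
import math

def h(z):
    return z / 10  # an (arbitrary) monotone non-decreasing drift function

def h_tilde_power(y, t):
    """t-fold application of x -> x - h(x), starting from y."""
    x = y
    for _ in range(t):
        x -= h(x)
    return x

y, target = 100.0, 5.0
# Theorem 7.6: t >= integral from target to y of dz/h(z) = 10*ln(y/target)
t = math.ceil(10 * math.log(y / target))
print(t, h_tilde_power(y, t))  # the iterate is at most the target
```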
We can specialize the previous theorem to the discrete case.
Theorem 7.7 [Estimation of Iterated Functions, Sum Formula].
Let $h\colon \natnum \rightarrow \realnum_{+}$ be a monotone non-decreasing function and let $\tilde{h} = \mathrm{id} - h$. Then, for all starting points $n \in \natnum$ and all target points $m \leq n$ and all time budgets $t$,
$$
\mbox{if } t \geq \sum_{i=m}^{n-1} \frac{1}{h(i)} \;\;\mbox{ then }\;\; \tilde{h}^t(n) \leq m.
$$
Proof.
We apply
Theorem 7.6 [Estimation of Iterated Functions] to $\overline{h}\colon \realnum \rightarrow \realnum_{\gt 0}, x \mapsto h(\max(0,\lfloor x \rfloor))$; note that we only care about non-negative arguments to $h$. Further, we use that, for all $i \in \natnum$,
$$
\frac{1}{h(i)} = \int_{i}^{i+1} \frac{1}{\overline{h}(z)}\mathrm{d}z.
$$
Summing over $i \in \{m,…, n-1\}$ gives $\sum_{i=m}^{n-1} \frac{1}{h(i)} = \int_{m}^{n} \frac{1}{\overline{h}(z)}\mathrm{d}z$, so the condition on $t$ transfers and the claim follows.
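The construction in this proof can be traced numerically. The sketch below uses the (arbitrary) discrete drift function $h(i) = (i+1)/10$, computes the time budget from the sum formula, and iterates $x \mapsto x - \overline{h}(x)$ to check that the target is reached.

```python
import math

def h(i):
    return (i + 1) / 10  # monotone non-decreasing drift on the naturals

def h_bar(x):
    return h(max(0, math.floor(x)))  # the extension from the proof

n, m = 100, 5
t = math.ceil(sum(1 / h(i) for i in range(m, n)))  # budget from Theorem 7.7

x = float(n)
for _ in range(t):
    x -= h_bar(x)
print(t, x)  # the iterate ends up at most m
```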
7.3 Applications to OneMax and LeadingOnes
In this section we show results from applications of
Theorem 7.4 [Variable Fixed-Budget Drift, Unlimited Time] as given in [
KW20Improved Fixed-Budget Results via Drift Analysis. Timo Kötzing, Carsten Witt. PPSN (2) 2020.]. We consider the optimization of the $(1+1)$ EA on
OneMax and on
LeadingOnes as examples. We start with
OneMax, where we have multiplicative drift.
Theorem 7.8 [Fixed Budget for OneMax].
For all $t \in \natnum$, let $X_t$ be the number of $1$s which the $(1+1)$ EA on ${\mathrm O{\scriptsize NE}M{\scriptsize AX}}$ has found after $t$ iterations of the algorithm. Then we have, for all $t$,
$$
\Ew{X_t} \geq
\begin{cases}
\frac{n}{2} + \frac{t}{2\sqrt{e}}-O(1), &\mbox{if }t = O(\sqrt{n});\\
\frac{n}{2} + \frac{t}{2\sqrt{e}}(1-o(1)), &\mbox{if }t = o(n).
\end{cases}
$$
Furthermore, for all $t$, we have $\Ew{X_t} \geq n(1 - \exp(-t/(en))/2)$.
For the $(1+1)$ EA on
OneMax, no concrete formula for a bound on the fitness value after $t$ iterations was known: the original work [
JZ12Fixed budget computations: a different perspective on run time analysis. Thomas Jansen, Christine Zarges. GECCO 2012.] could only handle RLS on
OneMax, not the $(1+1)$ EA. The multiplicative drift theorem of [
LS15Fixed Budget Performance of the (1+1) EA on Linear Functions. Johannes Lengler, Nicholas Spooner. FOGA 2015.] allows for deriving a lower bound of $n/2 + t/(2e)$ for $t = o(n)$, using a multiplicative drift constant of $(1-1/n)^n/n$. Since our drift theorem allows for variable drift, we can give the better bound of $n/2 + t/(2\sqrt{e}) - o(t)$ for the $(1+1)$ EA on
OneMax with $t = o(n)$. Note that [
LS15Fixed Budget Performance of the (1+1) EA on Linear Functions. Johannes Lengler, Nicholas Spooner. FOGA 2015.] also gives bounds for values of $t$ closer to the expected optimization time.
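The bounds of Theorem 7.8 can be compared with a direct simulation. The sketch below implements a standard $(1+1)$ EA on OneMax (random initialization, bit-flip probability $1/n$, ties accepted) and estimates $\Ew{X_t}$ for one arbitrary choice of $n$ and $t$, using the last bound of the theorem, which applies for all $t$.

```python
import math, random

def one_plus_one_ea_onemax(n, t, rng):
    """Run the (1+1) EA on OneMax for t iterations; return the fitness."""
    x = [rng.randint(0, 1) for _ in range(n)]
    fitness = sum(x)
    for _ in range(t):
        y = [b ^ (rng.random() < 1 / n) for b in x]  # flip each bit w.p. 1/n
        fy = sum(y)
        if fy >= fitness:  # accept if not worse
            x, fitness = y, fy
    return fitness

n, t, runs = 100, 200, 300
rng = random.Random(5)
avg = sum(one_plus_one_ea_onemax(n, t, rng) for _ in range(runs)) / runs
bound = n * (1 - math.exp(-t / (math.e * n)) / 2)  # lower bound of Theorem 7.8
print(avg, bound)
```

The empirical mean exceeds the bound, as it must; the gap stems from multi-bit improvements, which the multiplicative-drift argument does not credit.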
Our second example shows the progress of the $(1+1)$ EA on LeadingOnes, where we have additive drift. The result is summarized in the following theorem.
Theorem 7.9 [Fixed Budget for LeadingOnes].
For all $t \in \natnum$, let $X_t$ be the number of leading $1$s which the $(1+1)$ EA on
LeadingOnes has found after $t$ iterations of the algorithm. We have, for all $t$,
$$
\Ew{X_t} \geq
\begin{cases}
\frac{2t}{n} - O(1), &\mbox{if }t=O(n^{3/2});\\
\frac{2t}{n} \cdot (1 - o(1)), &\mbox{if }t=o(n^2);\\
n\ln(1+\frac{2t}{n^2}) - O(1), &\mbox{if }t \leq \frac{e-1}{2} n^2 - n^{3/2}.
\end{cases}
$$
For the $(1+1)$ EA on
LeadingOnes with a budget of $t = o(n^2)$ iterations, the paper [
JZ12Fixed budget computations: a different perspective on run time analysis. Thomas Jansen, Christine Zarges. GECCO 2012.] gives a lower bound of $2t/n - o(t/n)$ for the expected fitness after $t$ iterations, which is recovered here with a simpler proof. The general theorems from this section also allow budgets closer to the expected optimization time, where we get a lower bound of $n\ln(1+2t/n^2) - O(1)$.
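As with OneMax, the first bound of Theorem 7.9 can be checked against a simulation of the $(1+1)$ EA on LeadingOnes (random initialization, bit-flip probability $1/n$, ties accepted); the parameters below are arbitrary illustrative choices with $t$ well below $n^2$.

```python
import random

def leading_ones(x):
    """Number of leading 1s of the bit string x."""
    k = 0
    for b in x:
        if b == 0:
            break
        k += 1
    return k

def one_plus_one_ea_lo(n, t, rng):
    """Run the (1+1) EA on LeadingOnes for t iterations; return the fitness."""
    x = [rng.randint(0, 1) for _ in range(n)]
    f = leading_ones(x)
    for _ in range(t):
        y = [b ^ (rng.random() < 1 / n) for b in x]  # flip each bit w.p. 1/n
        fy = leading_ones(y)
        if fy >= f:  # accept if not worse
            x, f = y, fy
    return f

n, t, runs = 100, 200, 300
rng = random.Random(6)
avg = sum(one_plus_one_ea_lo(n, t, rng) for _ in range(runs)) / runs
print(avg, 2 * t / n)  # estimate vs. the leading term 2t/n of Theorem 7.9
```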
7.4 Bibliographic Remarks
The setting of fixed-budget analysis was introduced to the analysis of randomized search heuristics by Jansen and Zarges [
JZ12Fixed budget computations: a different perspective on run time analysis. Thomas Jansen, Christine Zarges. GECCO 2012.], who derived fixed-budget results for the classical example functions
OneMax and
LeadingOnes by bounding the expected progress in each iteration. A different perspective was proposed by Doerr, Jansen, Witt and Zarges [
DJWZ13A method to derive fixed budget results from expected optimisation times. Benjamin Doerr, Thomas Jansen, Carsten Witt, Christine Zarges. GECCO 2013.], who showed that fixed-budget statements can be derived from bounds on optimization times if these exhibit strong concentration. Lengler and Spooner [
LS15Fixed Budget Performance of the (1+1) EA on Linear Functions. Johannes Lengler, Nicholas Spooner. FOGA 2015.] proposed a variant of multiplicative drift for fixed-budget results and the use of differential equations in the context of
OneMax and general linear functions. Nallaperuma, Neumann and Sudholt [
NNS17Expected Fitness Gains of Randomized Search Heuristics for the Traveling Salesperson Problem. Samadhi Nallaperuma, Frank Neumann, Dirk Sudholt. Evol. Comput. 2017.] applied fixed-budget theory to the analysis of evolutionary algorithms on the traveling salesman problem and Jansen and Zarges [
JZ14Reevaluating Immune-Inspired Hypermutations Using the Fixed Budget Perspective. Thomas Jansen, Christine Zarges. IEEE Trans. Evol. Comput. 2014.] to artificial immune systems. The quality gains of optimal black-box algorithms on
OneMax in a fixed-budget perspective were analyzed by Doerr, Doerr and Yang [
DDY20Optimal parameter choices via precise black-box analysis. Benjamin Doerr, Carola Doerr, Jing Yang. Theor. Comput. Sci. 2020.]. He, Jansen and Zarges [
HJZ19Unlimited budget analysis. Jun He, Thomas Jansen, Christine Zarges. GECCO (Companion) 2019.] consider the so-called unlimited budgets to estimate fitness values in particular for points of time larger than the expected optimization time. A survey by Jansen [
Jan20Analysing Stochastic Search Heuristics Operating on a Fixed Budget. Thomas Jansen. In: Theory of Evolutionary Computation 2020.] summarizes the state of the art in the area of fixed-budget analysis.
In contrast to the numerous drift theorems available for bounding the optimization time, there was no corresponding theorem for fixed-budget analyses apart from one for the multiplicative case given in [
LS15Fixed Budget Performance of the (1+1) EA on Linear Functions. Johannes Lengler, Nicholas Spooner. FOGA 2015.]. This changed with [
KW20Improved Fixed-Budget Results via Drift Analysis. Timo Kötzing, Carsten Witt. PPSN (2) 2020.], presented in this chapter, which provides several such drift theorems.
Note that a further fixed-budget drift theorem can be found in [
KW20Improved Fixed-Budget Results via Drift Analysis. Timo Kötzing, Carsten Witt. PPSN (2) 2020.], which computes fixed-budget results via a detour over first hitting times.
This work is licensed as Creative Commons Attribution - ShareAlike 4.0 International.
All mathematical statements in this script are freely usable, of course. As a collection and with comments they are licensed as stated above.