Analysis of the (1+1) EA on LeadingOnes with Constraints

Understanding how evolutionary algorithms perform on constrained problems has gained increasing attention in recent years. In this paper, we study how evolutionary algorithms optimize constrained versions of the classical LeadingOnes problem. We first provide a run time analysis for the classical (1+1) EA on the LeadingOnes problem with a deterministic cardinality constraint, proving the tight bound $\Theta(n (n-B)\log(B) + n^2)$. Our results show that the behaviour of the algorithm is highly dependent on the chosen bound of the uniform constraint. Afterwards, we consider the problem in the context of stochastic constraints and provide insights, using experimental studies, on how the ($\mu$+1) EA is able to deal with these constraints in a sampling-based setting.


INTRODUCTION
Evolutionary algorithms [6] have been used to tackle a wide range of combinatorial and complex engineering problems. Understanding evolutionary algorithms from a theoretical perspective is crucial to explain their success and to give guidelines for their application. The area of run time analysis has been a major contributor to the theoretical understanding of evolutionary algorithms over the last 25 years [4,11,19]. Classical benchmark problems such as OneMax and LeadingOnes have been analyzed in great detail, yielding deep insights into the working behaviour of evolutionary algorithms on these problems. In real-world settings, the problems to be optimized usually come with a set of constraints which often limit the available resources. Studying classical benchmark problems even with an additional simple constraint, such as a uniform constraint which limits the number of elements that can be chosen in a given benchmark function, poses significant new technical challenges for proving run time bounds even for simple evolutionary algorithms such as the (1+1) EA.
OneMax and the broader class of linear functions [5] have played a key role in developing the area of run time analysis during the last 25 years, and run time bounds for linear functions with a uniform constraint have been obtained [7,17]. It has been shown in [7] that the (1+1) EA needs exponential time to optimize OneMax under a specific linear constraint, which points to the additional difficulty that such constraints impose on the search process. Tackling constraints by treating them as additional objectives has been shown to be quite successful for a wide range of problems. For example, the behaviour of evolutionary multi-objective algorithms has been analyzed for submodular optimization problems with various types of constraints [20,21]. Furthermore, the performance of evolutionary algorithms for problems with dynamic constraints has been investigated in [22,23].
Another important area involving constraints is chance constrained optimization, which deals with stochastic components in the constraints. Here, the presence of stochastic components makes it challenging to guarantee that the constraints are never violated. Chance-constrained optimization problems [2,13] are an important class of stochastic optimization problems [1] that optimize a given problem under the condition that a constraint is violated only with a small probability. Such problems occur in a wide range of areas, including finance, logistics and engineering [9,12,14,29]. Recent studies of evolutionary algorithms for chance-constrained problems focused on a classical knapsack problem where the uncertainty lies in the probabilistic constraints [26,27]. Here, the aim is to maximise the deterministic profit subject to a constraint which involves stochastic weights and where the knapsack capacity bound may only be violated with a small probability. A different stochastic version of the knapsack problem has been studied in [16], where profits involve uncertainties and weights are deterministic. In that work, Chebyshev- and Hoeffding-based fitness functions have been introduced and evaluated; these fitness functions discount expected profit values based on the uncertainties of the given solutions.
Theoretical investigations for problems with chance constraints have gained recent attention in the area of run time analysis. This includes studies for monotone submodular problems [15] and special instances of makespan scheduling [24]. Furthermore, detailed run time analyses have been carried out for specific classes of instances of the chance constrained knapsack problem [18,28].

Our contribution
In this paper, we investigate the behaviour of the (1+1) EA for the classical LeadingOnes problem with additional constraints. We first study the behaviour for the case of a uniform constraint which limits the number of 1-bits that any feasible solution may contain. Let B be the upper bound on the number of 1-bits of any feasible solution. Then the optimal solution consists of exactly B leading 1s followed only by 0s. The search of the (1+1) EA is complicated by the fact that, when the current solution has k < B leading 1s, additional 1-bits at positions k+2, …, n, which do not contribute to the fitness score, might make solutions infeasible. We provide a detailed analysis of such scenarios in dependence of the given bound B.
Specifically, we show a tight bound of Θ(n² + n(n − B) log B) (see Corollary 6). Note that [7] shows the weaker bound of O(n² log B), which, crucially, does not give insight into the actual optimization process at the constraint. Our analysis shows in some detail how the search progresses. In the following discussion, for the current search point of the algorithm, we call the part consisting of the leading 1s the head of the bit string, the first 0 the critical bit and the remaining bits the tail. While the size of the head is less than B − (n − B), optimization proceeds much like for unconstrained LeadingOnes; this is because the bits in the tail of size about 2(n − B) are (almost) uniformly distributed, contributing roughly n − B additional 1s to the B − (n − B) many 1s in the head. In sum this stays (mostly) below the cardinality bound B, with occasional violations changing the uniform distribution of the tail to one where bits in the tail are 1 with probability a little less than 1/2 (see Lemma 3).
Once the threshold of B − (n − B) many 1s in the head is passed, the algorithm frequently runs into the constraint. For a phase of equal LeadingOnes value, we consider the random walk of the number of 1s in the current bit string of the algorithm. This walk has a bias towards the bound B (its maximal value), where the bias is light for LeadingOnes values just a bit above B − (n − B) and gets stronger as this value approaches B. Since progress is easy when the bit string has fewer than B many 1s (by flipping the critical bit and no other) and difficult otherwise (in addition to flipping the critical bit, a 1 in the tail needs to flip), the exact proportion of time that the walk spends in states of fewer than B versus exactly B many 1s is very important. In the final proofs, we estimate these factors and use corresponding potential functions reflecting gains (1) from changing into states of fewer than B many 1s and (2) from gaining a leading 1. Bounding these gains appropriately lets us find asymptotically matching upper and lower bounds using the additive drift theorem [10].
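The head / critical bit / tail terminology used above can be made concrete with a small helper (an illustrative sketch of ours, not code from the paper):

```python
def decompose(x):
    """Split a bit string (list of 0/1) into head (the leading 1s),
    critical bit (the left-most 0) and tail (everything after it)."""
    k = 0
    while k < len(x) and x[k] == 1:
        k += 1
    if k == len(x):                 # all-1s string: no critical bit
        return x, None, []
    return x[:k], x[k], x[k + 1:]
```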
In passing we note that two different modifications of the setting yield a better time of O(n²). First, this time is sufficient to achieve a LeadingOnes value of B − ε(n − B) for any ε > 0 (see Corollary 7). Second, considering the number of 1s as a secondary objective (to be minimized) gives an optimization time of O(n²) (see Theorem 8).
Afterwards, we turn to stochastic constraints and investigate an experimental setting motivated by recent studies in the area of chance constraints. We consider LeadingOnes with a stochastic knapsack constraint, where the weights of a linear constraint are drawn from a given distribution. In the first setting, the weight of each item is chosen independently according to a Normal distribution N(μ, σ²). A random sample of weights is feasible if the sum of the sampled weights of the chosen items does not exceed a given knapsack bound. In every iteration, all weights are resampled independently for all evaluated individuals. Our goal is to understand the maximal stable LeadingOnes value that the algorithm obtains.
In the second setting, which we study empirically, the weights are deterministically set to 1 and the bound is chosen uniformly at random from an interval [B − D, B + D], where D > 0 specifies the uncertainty around the constraint bound. For both settings, we examine the performance of the (1+1) EA and the (10+1) EA for different values of D and show that a larger parent population has a highly positive effect in these stochastic settings.
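The two sampling models can be sketched as follows (the function and parameter names are ours; the defaults μ = 1 and σ = 0.1 are the values used later in the experiments):

```python
import random

rng = random.Random(0)

def feasible_normal_weights(x, bound, mu=1.0, sigma=0.1):
    """First model: every chosen item gets a freshly sampled N(mu, sigma^2)
    weight; feasible if the sampled weights sum to at most the bound."""
    total = sum(rng.gauss(mu, sigma) for bit in x if bit == 1)
    return total <= bound

def feasible_uniform_bound(x, B, D):
    """Second model: unit weights, bound drawn uniformly from [B - D, B + D]."""
    return sum(x) <= rng.uniform(B - D, B + D)
```

Note that both checks are randomized: calling them twice on the same bit string can give different answers, which is exactly the resampling behaviour described above.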
The paper is structured as follows. In Section 2, we introduce the problems and algorithms studied in this paper. We present our run time analysis for the LeadingOnes problem with a deterministic uniform constraint in Section 3. In Section 4, we discuss a way to obtain a Θ(n²) bound on the run time for the same problem, and we report on our empirical investigations for the stochastic settings in Section 5. Finally, we finish with some concluding remarks. Note that some proofs are omitted due to space constraints.

PRELIMINARIES
In this section we define the objective function, the constraints and the algorithms used in our analysis. With |x|₁ we denote the number of 1s in a bit string x ∈ {0,1}ⁿ.

Objective Function
We consider the LeadingOnes function, subject to cardinality and stochastic constraints, as the objective of our analysis.
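For concreteness, LeadingOnes and one possible constrained fitness might look as follows; the penalty for infeasible strings is an illustrative assumption of ours, since the paper's exact constraint-handling function is given with the algorithm definition:

```python
def leading_ones(x):
    """LeadingOnes: the number of consecutive 1s at the start of x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count

def cardinality_fitness(x, B):
    """Illustrative constrained fitness (our assumption): rank infeasible
    strings (more than B ones) below all feasible ones, preferring fewer 1s
    among the infeasible ones."""
    if sum(x) > B:
        return -sum(x)
    return leading_ones(x)
```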

(𝜇+1) EA
The (μ+1) EA on a real-valued fitness function f with a constraint is given in Algorithm 1. The (μ+1) EA maintains a population of size μ. The initial population P₀ consists of μ bit strings chosen uniformly at random. In each iteration t > 0, a bit string is chosen uniformly at random from the current population, followed by a mutation operation which flips each bit of the chosen bit string with probability 1/n. The mutated bit string is added to the population and a bit string with the least fitness among the μ + 1 individuals is removed. Since we can also sample a bit string which violates the constraint, we consider the following function for optimization.
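The loop just described (uniform parent selection, standard bit mutation with rate 1/n, removal of a least-fit individual) can be sketched as follows; the fitness argument stands for any function implementing the constrained objective:

```python
import random

def mu_plus_one_ea(fitness, n, mu, steps, seed=0):
    """(mu+1) EA sketch: keep a population of mu bit strings, create one
    offspring per iteration, and discard a least-fit individual."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    for _ in range(steps):
        parent = rng.choice(pop)
        child = [1 - b if rng.random() < 1.0 / n else b for b in parent]
        pop.append(child)
        pop.remove(min(pop, key=fitness))   # drop a least-fit individual
    return max(pop, key=fitness)
```

Setting mu = 1 recovers a (1+1) EA variant in which the offspring replaces the parent whenever it is not worse.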

UNMODIFIED SETTING
In this section we give a tight analysis of the (1+1) EA on the objective LeadingOnes with cardinality constraint B.
We start with a technical lemma which we need for our proof of the upper bound.

Lemma 1. For t ≥ 0, let x_t denote the parent bit string in the t-th iteration while the (1+1) EA is optimizing LeadingOnes with the cardinality constraint B. For j > 0, let A_j denote the event that LO(x_{t+1}) = LO(x_t) and |x_{t+1}|₁ = j. Then, for |x_t|₁ = k < j, Pr(A_j) ≤ C(n − k − 1, j − k) · (1/n)^{j−k}, where C(·,·) denotes the binomial coefficient.

Proof. First note that, if |x_t|₁ = k < B and E_j denotes the event that x_{t+1} is formed by flipping at least j − k of the n − k − 1 many 0-bits (all 0-bits except the left-most one) to 1, then Pr(E_j) ≤ C(n − k − 1, j − k) · (1/n)^{j−k}. The event A_j is a sub-event of E_j, since the event E_j places no restriction on the bits other than the j − k many 0-bits out of the n − k − 1 many, and at least j − k many 0-bits have to flip to 1 to obtain the desired x_{t+1} in the event A_j. Hence Pr(A_j) ≤ Pr(E_j), and the claimed bound follows. □

In Theorem 2 below we give an upper bound on the expected run time of the (1+1) EA on LeadingOnes with cardinality constraint B. Later we show that this bound is tight by proving a matching lower bound.

Theorem 2. Let n, B ∈ N and B < n. Then the expected optimization time of the (1+1) EA on LeadingOnes with cardinality constraint B is O(n² + n(n − B) log B).
Proof. From [8, Lemma 3], we know that the (1+1) EA is expected to find a feasible solution within O(n log(n/B)) iterations. Now we calculate how long it takes in expectation to find the optimum after a feasible solution has been sampled.
For t ≥ 0, let x_t be the parent bit string of the (1+1) EA at iteration t, and let T be the iteration at which the (1+1) EA finds the optimum for the first time. We use a potential function which, for each LeadingOnes value LO(x_t) = i ∈ {0, …, B} (the number of leading 1s of x_t), distinguishes whether the number of 1s equals B or is less than B. We consider the two cases |x_t|₁ = B and |x_t|₁ < B and show that in both cases the drift is at least 1. Suppose we are in an iteration t < T with LO(x_t) = i and |x_t|₁ = B. Then the probability that the number of 1s in the search point decreases by 1 in the next iteration is at least (B − i)/(en): we obtain such a search point by flipping exactly one of the B − i many 1-bits outside the leading 1s and not flipping any other bit. Suppose instead we are in an iteration t < T with LO(x_t) = i and |x_t|₁ < B.
Then in the next iteration the value of LeadingOnes can increase when the left-most 0 is flipped to 1, as this does not violate the constraint; this happens with probability at least 1/(en). Since |x_t|₁ < B, the algorithm can also stay on the same level (same number of leading 1s) while the number of 1s increases to B, which happens with probability at most the bound given in Lemma 1. This bounds the probability with which the potential can decrease in this case.
This results in an expected additive drift of at least 1 in all cases, so according to the additive drift theorem [10, Theorem 5], E[T] = O(n² + n(n − B) log B).

□
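For reference, the additive drift theorem invoked here ([10, Theorem 5]) is commonly stated in the following form (paraphrased; g is a non-negative potential and T the hitting time of potential 0):

```latex
% Additive drift (upper-bound form): if, for all t < T,
% E[g(x_t) - g(x_{t+1}) | g(x_t) > 0] >= delta for some delta > 0, then
\[
  \mathbb{E}[T] \;\le\; \frac{\mathbb{E}[g(x_0)]}{\delta},
  \qquad T = \min\{t \ge 0 \mid g(x_t) = 0\}.
\]
```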
We now turn to the lower bound. When the (1+1) EA optimizes LeadingOnes in the unconstrained setting, the probability that a bit after the left-most 0 is 1 is exactly 1/2. This is no longer true in the constrained setting. The following lemma gives an upper bound on this probability during optimization under the cardinality constraint.

Lemma 3. For any t ≥ 0, let x_t denote the search point at iteration t when the (1+1) EA is optimizing LeadingOnes with the cardinality constraint B, and let x_{t,j} denote the j-th bit of x_t. Then, for any t ≥ 0 and j > LO(x_t), Pr(x_{t,j} = 1) ≤ 1/2.
Proof. We prove this by induction on t. The base case holds because x_0 is a uniformly random bit string. Assume the statement is true for t, i.e. for any j > LO(x_t) we have Pr(x_{t,j} = 1) ≤ 1/2. Let A be the event that the offspring is accepted. For j > LO(x_{t+1}), let p = Pr(x_{t,j} = 1), q = Pr(A ∩ {the j-th bit is flipped} ∩ {x_{t,j} = 0}) and r = Pr(A ∩ {the j-th bit is flipped} ∩ {x_{t,j} = 1}). Then note that q ≤ r, because at least as many events contribute to the probability r as to the probability q, and together with the induction hypothesis this gives Pr(x_{t+1,j} = 1) = p + q − r ≤ p ≤ 1/2.

□
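Lemma 3 can also be checked empirically. The following sketch (with an assumed penalty-based acceptance rule for infeasible offspring, which the lemma itself does not prescribe) tracks how often the last bit, which stays in the tail throughout the run, equals 1:

```python
import random

def tail_one_frequency(n=20, B=15, steps=4000, seed=1):
    """Run a (1+1) EA on LeadingOnes with cardinality bound B and return
    the fraction of iterations in which the last bit (a tail position) is 1."""
    rng = random.Random(seed)

    def fitness(x):
        if sum(x) > B:                 # assumed penalty for infeasible points
            return -sum(x)
        lo = 0
        while lo < len(x) and x[lo] == 1:
            lo += 1
        return lo

    x = [rng.randint(0, 1) for _ in range(n)]
    ones = 0
    for _ in range(steps):
        y = [1 - b if rng.random() < 1.0 / n else b for b in x]
        if fitness(y) >= fitness(x):
            x = y
        ones += x[-1]
    return ones / steps
```

By Lemma 3 the per-iteration probability of observing a 1 here is at most 1/2, so the empirical frequency should stay around or below one half.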
We use the previous lemma to prove the Ω(n²) lower bound on the expected time in Theorem 4. Proof. We use the fitness level method with visit probabilities from [3, Theorem 8] to prove this lower bound. Similar to [3, Theorem 11], we partition the search space {0,1}ⁿ based on the LeadingOnes value: for every i ≤ B, let A_i contain all bit strings with LeadingOnes value i. If the current search point is in A_i, we say that it is in state i. For every i ∈ {1, …, B − 1}, we have to find the visit probability v_i and an upper bound on u_i, the probability to leave state i.
The best case for the search point to leave state i is when its number of 1s is less than B. In this case, the (i+1)-th bit has to flip to 1 while none of the first i bits flips to 0; this happens with probability (1/n)(1 − 1/n)^i ≤ 1/n. Therefore, u_i ≤ 1/n for every i. Next, we claim that, for each i ∈ {1, …, B − 1}, the probability v_i to visit state i is at least 1/2. We use [3, Lemma 10] to show this. Suppose the initial search point is in a state greater than or equal to i; then the probability that it is in state i equals the probability that the (i+1)-th bit is 0, which is 1/2 because the initial bit string is chosen uniformly at random. This establishes the first bound required by [3, Lemma 10]. Suppose the search point transitions into a level greater than or equal to i; then the probability that it transitions into state i equals the probability that the (i+1)-th bit is 0, which by Lemma 3 is at least 1/2. This gives the second bound required by [3, Lemma 10]; therefore v_i ≥ 1/2. By the fitness level method with visit probabilities [3, Theorem 8], if T is the time taken by the (1+1) EA to find an individual with B leading 1s for the first time, then we have E[T] ≥ Σ_{i=1}^{B−1} v_i/u_i. We aim to show the Ω(n² + n(n − B) log B) lower bound, and Theorem 4 gives the Ω(n²) part. Therefore, we next consider the case where B is such that n(n − B) log B ≠ O(n²), to prove the remaining part of the desired lower bound. Theorem 5. Let n, B ∈ N and suppose n(n − B) log B = ω(n²). Then the expected optimization time of the (1+1) EA on the objective LeadingOnes with cardinality constraint B is Ω(n(n − B) log B).
Proof. We consider a potential function g such that, for all x ∈ {0,1}ⁿ, g(x) is the sum of two terms. The first term rewards progress by reducing the number of 1s; it is scaled so that later we can derive a constant drift in expectation from such a reduction whenever |x|₁ = B, the case in which progress by increasing the number of leading 1s is not easy. The second term rewards progress by increasing the number of leading 1s, scaled so as to derive a constant drift in the case |x|₁ < B.
The idea of the proof is as follows. We show that the potential decreases by at most 10 in expectation in each iteration. The lower-bound version of the additive drift theorem then gives the desired lower bound on the expected run time (see [10, Theorem 5]).
We start by calculating the expected potential at t = 0. Since the initial bit string is chosen uniformly at random, the probability that the first bit is 0 is 1/2, which implies Pr(LO(x_0) = 0) = 1/2.
Therefore, there exists a constant c > 0 such that E[g(x_0)] ≥ c · n(n − B) log B. The optimum has the smallest potential value, say s; thus, we can find a lower bound on the optimization time by considering the time to reach a potential value of at most s. Let T = min{t ≥ 0 | g(x_t) ≤ s}. Note that T may not be the time at which we find the optimum for the first time. From n(n − B) log B = ω(n²) we get, for n large enough, that E[g(x_0)] > s, which implies that the expected optimization time is at least E[T].
In order to bound the drift, we consider two different cases, |x_t|₁ = B and |x_t|₁ < B, and show that in both cases the expected decrease of the potential is at most 10. First, we examine the case where the current search point has exactly B many 1s; for any t, let F_t denote this event. We now calculate bounds for all the required expectations.
First we calculate a bound for the contribution of the first potential term, using the infinite sum Σ_{i≥1} i/2^i = 2 to bound the required finite sums. Next we calculate an upper bound for the second contribution. The probability that LO(x_{t+1}) − LO(x_t) = i, given that we gain at least one leading 1, is the probability that the next i − 1 bits after the left-most 0 are 1, followed by a 0 bit; hence this conditional probability is at most 1/2^{i−1}. Therefore, combining Equations 3 and 4 yields the required bound, where we used the infinite sum Σ_{i≥1} i²/2^i = 6 to bound the required finite sums in the above calculation.
From Equations 2 and 5, we have E[Δ_t | F_t] ≤ 10, which concludes the first case (|x_t|₁ = B). Next we bound the drift conditioned on the complementary event |x_t|₁ < B. As in the previous case, we start by bounding the first contribution; calculations similar to those leading to Equation (2) give the corresponding bound. Since there are at least n − B many 0-bits, the probability to gain a 1-bit is at least (n − B)/(en), and the probability that LO(x_t) = LO(x_{t+1}) is at least 1/2 for n large enough. Combining these two bounds gives the required estimate.
Next we calculate the conditional drift to obtain an upper bound on the expected potential decrease in this case. When |x_t|₁ < B, the probability to gain in the LeadingOnes value is at most 1/n. Since B − LO(x_t) ≥ 1, the required bound follows, which completes the second case. □

Proof (of Corollary 6). From Theorem 4 and Theorem 5 we have the required lower bound, and the upper bound follows from Theorem 2. Therefore, the expected optimization time is Θ(n² + n(n − B) log B). □
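The two infinite sums used in the drift estimates above can be verified numerically; the series converge geometrically, so truncation introduces negligible error:

```python
# sum_{i>=1} i/2^i = 2 and sum_{i>=1} i^2/2^i = 6; truncating at 200 terms
# leaves an error far below 1e-12.
s1 = sum(i / 2.0 ** i for i in range(1, 201))
s2 = sum(i ** 2 / 2.0 ** i for i in range(1, 201))
print(s1, s2)
```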

BETTER RUN TIMES
In this section we discuss two ways to obtain the (optimal) run time of O(n²). First, we state a corollary to the proof of Theorem 2, showing that the algorithm can almost reach the bound within O(n²) iterations. Second, the (1+1) EA takes Θ(n²) in expectation to optimize LeadingOnes in lexicographic order (with the number of 1s as a secondary objective to be minimized) under the cardinality constraint B.
Let T be the first point in time at which the potential reaches its target value. We examine the drift in two different scenarios, |x_t|₁ < B and |x_t|₁ = B, and show that in both cases the drift is at least 1/n. Let Δ_t = g(x_{t+1}) − g(x_t) and let E_t be the event that the left-most 0 in x_t is flipped. Then E[Δ_t | ¬E_t] ≥ 0, because if the number of leading 1s does not increase, then |x_{t+1}|₀ − |x_t|₀ ≥ 0, which in turn implies Δ_t ≥ 0. The expected number of 0s in the initially selected uniformly random bit string is n/2 and the expected number of leading 1s is at least zero; therefore E[g(x_0)] ≥ n/2. With a drift of at least 1/n in both cases, the additive drift theorem [10, Theorem 5] gives the required upper bound of O(n²). The lower bound follows from Theorem 4. □
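The lexicographic variant behind Theorem 8 compares search points first by feasibility, then by LeadingOnes (maximized), then by the number of 1s (minimized). One illustrative encoding as a sortable key (our own sketch; the paper's exact tie-breaking details may differ):

```python
def lex_key(x, B):
    """Key for lexicographic optimization: feasibility first, then
    LeadingOnes (maximized), then number of 1s (minimized, hence negated)."""
    lo = 0
    while lo < len(x) and x[lo] == 1:
        lo += 1
    return (sum(x) <= B, lo, -sum(x))
```

Comparing such keys with Python's built-in tuple ordering then realises the acceptance rule: an offspring replaces the parent if its key is at least as large.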

EMPIRICAL ANALYSIS
We now extend our theoretical work on the deterministic constraint to the case of the stochastic constraint models (as defined in Section 2.3). For both models we consider two different values of B, namely 75 and 95 (and additionally B = 85 in the Appendix). As we will see, the (1+1) EA struggles in these settings; in order to show that already a small parent population can remedy this, we also consider the (10+1) EA in our experiments.
We use the following lemma to discuss certain probabilities in this section. In Figure 1 we show a single sample run of the (1+1) EA on the first model. We observe that whenever the (1+1) EA finds a bit string with B many 1s, it violates the constraint with probability 1/2 (see Lemma 9) and then accepts a bit string with a smaller number of leading 1s. This process keeps repeating whenever the (1+1) EA encounters an individual whose number of 1s is close to B.
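The 1/2 violation probability referenced above (Lemma 9) follows from the symmetry of the Normal distribution: with exactly B chosen items of weight N(1, σ²) and bound B, the resampled weight sum is symmetric around B. A quick Monte Carlo sketch (our own, with assumed parameter values):

```python
import random

def violation_rate(B=75, sigma=0.1, trials=20000, seed=2):
    """Fraction of resamples in which B fresh N(1, sigma^2) weights exceed
    the bound B; by symmetry this should be close to 1/2."""
    rng = random.Random(seed)
    hits = sum(
        sum(rng.gauss(1.0, sigma) for _ in range(B)) > B
        for _ in range(trials)
    )
    return hits / trials
```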

CONCLUSIONS
Understanding how evolutionary algorithms deal with constrained problems is an important topic of research. We investigated the classical LeadingOnes problem with additional constraints. For the case of a deterministic uniform constraint, we have carried out a rigorous run time analysis of the (1+1) EA which gives results on the expected optimization time in dependence of the chosen constraint bound. Afterwards, we examined stochastic constraints and the use of larger populations for dealing with uncertainties. Our results show a clear benefit of using the (10+1) EA instead of the (1+1) EA. We regard the run time analysis of population-based algorithms for our examined settings of stochastic constraints as an important topic for future work.

Theorem 4.
Let n, B ∈ N. Then the expected optimization time of the (1+1) EA on LeadingOnes with cardinality constraint B is Ω(n²).

For the first model we use parameters μ = 1 and σ = 0.1, and for the second model we use D = √3. Note that in the second model the bound is chosen uniformly at random from (B − √3, B + √3), which has variance 1.

Figures 2 and 3
Figures 2 and 3 are about the first model: they show the LeadingOnes value of the best individual (the bit string with the maximum fitness value) in each iteration of the (10+1) EA, the LeadingOnes value of the second-worst individual (the bit string with the second-smallest fitness value) in each iteration of the (10+1) EA, and the LeadingOnes value in each iteration of the (1+1) EA. Each curve is the median of thirty independent runs, with shaded regions indicating the variation across runs.

Figures 4 and 5
Figures 4 and 5 are about the second model; the curves represent the same quantities as in the previous figures, but with respect to the second model. In these figures we can see that the best and the second-worst individuals of the (10+1) EA are not the same, because of the changing constraint values.

We used the infinite sum value Σ_{i≥1} i/2^i = 2 to bound the required finite sums in the above calculation. Now we calculate the conditional drift E[Δ_t | F_t] to obtain an upper bound for this case. When |x_t|₁ = B, the probability to gain in the LeadingOnes value is at most 1/n.