The genetic algorithm is a stochastic global optimization algorithm. Together with artificial neural networks, it is perhaps one of the most popular and best-known biologically inspired algorithms. It is an evolutionary algorithm that performs an optimization procedure inspired by the biological theory of evolution by natural selection, using a binary representation and simple operators based on genetic recombination and genetic mutation.
In this tutorial, you will discover the genetic algorithm optimization algorithm. After completing this tutorial, you will know:
- The genetic algorithm is a stochastic optimization algorithm inspired by evolution.
- How to implement the genetic algorithm from scratch in Python.
- How to apply the genetic algorithm to a continuous objective function.
Tutorial overview
This tutorial is divided into four parts. They are:
- Genetic algorithm (GA)
- Genetic algorithm from scratch
- Genetic algorithm for OneMax
- Genetic algorithm for continuous function optimization
Genetic algorithm (GA)
The genetic algorithm is a stochastic global search optimization algorithm. It is inspired by the biological theory of evolution by natural selection, specifically the modern synthesis that combines an understanding of genetics with the theory of evolution.
The algorithm uses analogues of genetic representation (bit strings), fitness (function evaluation), genetic recombination (crossover of bit strings), and mutation (flipping bits). The algorithm works by first creating a population of random bit strings of fixed size. The main loop of the algorithm is repeated for a fixed number of iterations, or until no further improvement is seen in the best solution over a given number of iterations. One iteration of the algorithm corresponds to one evolutionary generation.
First, the population of bit strings (candidate solutions) is evaluated using an objective function. The objective function evaluation of each candidate solution is taken as the fitness of the solution, which may be minimized or maximized. Parents are then selected based on their fitness. A given candidate solution may be used as a parent zero or more times. A simple and effective selection method involves randomly drawing k candidates from the population and selecting the fittest member of the group. This is known as tournament selection, where k is a hyperparameter set to a value such as 3. This simple approach approximates the more costly fitness-proportionate selection scheme.
Parents are used as the basis for generating the next generation of candidate solutions, and one parent is selected for each position in the population.
Parents are then taken in pairs and used to create two children. Recombination is performed using a crossover operator. This involves selecting a random split point on the bit string, then creating a child with the bits from the first parent up to the split point and the bits from the second parent from the split point to the end of the string. The process is then reversed for the second child.
For example, two parents:
parent1 = 00000
parent2 = 11111
may result in two crossover children:
child1 = 00011
child2 = 11100
This is called one-point crossover, and there are many other variations of the operator.
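As a quick sketch, the worked example above can be reproduced with plain list slicing; the split point of 3 is chosen here purely for illustration:

```python
# one-point crossover on two 5-bit parents
parent1 = [0, 0, 0, 0, 0]
parent2 = [1, 1, 1, 1, 1]
pt = 3  # split point, chosen for illustration
child1 = parent1[:pt] + parent2[pt:]  # bits from parent1, then parent2
child2 = parent2[:pt] + parent1[pt:]  # the reverse for the second child
print(child1)  # [0, 0, 0, 1, 1]
print(child2)  # [1, 1, 1, 0, 0]
```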
Crossover is applied probabilistically for each pair of parents, meaning that in some cases copies of the parents are taken as the children instead of applying the recombination operator. Crossover is controlled by a hyperparameter set to a large value, such as 80% or 90%. Mutation involves flipping bits in the newly created child candidate solutions. Typically, the mutation rate is set to 1/L, where L is the length of the bit string.
For example, if the problem uses a bit string with 20 bits, a good default mutation rate would be (1/20) = 0.05 or 5% probability.
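To get a feel for what a 1/L rate means in practice, the arithmetic below (a rough sanity check, not part of the algorithm itself) computes the expected number of flipped bits per child, and the chance that at least one bit flips, for a 20-bit string:

```python
L = 20            # bits in the string
r_mut = 1.0 / L   # 0.05, i.e. a 5% chance per bit

# on average one bit flips per child
expected_flips = L * r_mut
# chance that at least one bit flips
p_any_flip = 1.0 - (1.0 - r_mut) ** L

print(expected_flips)        # 1.0
print(round(p_any_flip, 3))  # 0.642
```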
This defines the simple genetic algorithm process. This is a large area of research, and many extensions have been made to the algorithm.
Now that we’re familiar with the simple genetic algorithm process, let’s look at how to implement it from scratch.
Genetic algorithm from scratch
In this section, we will develop an implementation of the genetic algorithm. The first step is to create a population of random bit strings. We could use Boolean values True and False, string values “0” and “1”, or integer values 0 and 1. In this case, we will use integer values. We can generate an array of integer values in a range using the randint() function, where the range is specified as starting at 0 and ending before 2, i.e. values of 0 or 1. For simplicity, we will also represent a candidate solution as a list rather than a NumPy array. An initial population of random bit strings can be created as follows, where “n_pop” is the hyperparameter that controls the population size and “n_bits” is the hyperparameter that defines the number of bits in a single candidate solution:
# initial population of random bitstring
pop = [randint(0, 2, n_bits).tolist() for _ in range(n_pop)]
Next, we can enumerate a fixed number of iterations of the algorithm, in this case controlled by a hyperparameter named “n_iter”.
# enumerate generations
for gen in range(n_iter):
    ...
The first step of algorithm iteration is to evaluate all candidate solutions.
We will use a function named objective() as a generic objective function and call it to get a fitness score, which we will minimize.
# evaluate all candidates in the population
scores = [objective(c) for c in pop]
We can then select the parent that will be used to create the children.
The tournament selection procedure can be implemented as a function that takes the population and returns one selected parent. The value of k is fixed at 3 via a default argument, but you can experiment with different values if you like.
# tournament selection
def selection(pop, scores, k=3):
    # first random selection
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k-1):
        # check if better (e.g. perform a tournament)
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]
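To sanity-check the selection routine, the sketch below (restating the function so it runs standalone, with a made-up population of five 4-bit strings and made-up scores) confirms that whatever the tournament returns is always a member of the population:

```python
from numpy.random import randint, seed

# tournament selection over a population with known scores
def selection(pop, scores, k=3):
    # first random selection
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k-1):
        # keep the candidate with the better (lower) score
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]

seed(1)  # fix the RNG so the run is repeatable
pop = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 1]]
scores = [0, -1, -2, -3, -4]  # made-up fitness values (lower is better)
parent = selection(pop, scores)
print(parent in pop)  # True: the winner is always drawn from the population
```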
We can then call this function once for each location in the population to create a list of parents.
# select parents
selected = [selection(pop, scores) for _ in range(n_pop)]
Then, we can create the next generation.
This first requires a crossover function to be implemented. This function takes two parents and the crossover rate. The crossover rate is a hyperparameter that determines whether crossover is performed or not; if not, the parents are copied into the next generation. It is a probability, typically with a large value close to 1.0.
The crossover() function below uses a random number in the range [0,1] to determine whether crossover is performed, and then selects a valid split point if it is.
# crossover two parents to create two children
def crossover(p1, p2, r_cross):
    # children are copies of parents by default
    c1, c2 = p1.copy(), p2.copy()
    # check for recombination
    if rand() < r_cross:
        # select crossover point that is not on the end of the string
        pt = randint(1, len(p1)-2)
        # perform crossover
        c1 = p1[:pt] + p2[pt:]
        c2 = p2[:pt] + p1[pt:]
    return [c1, c2]
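Two properties worth sanity-checking (using hypothetical 5-bit parents): a crossover rate of 0.0 must return exact copies of the parents, while a rate of 1.0 always recombines, only rearranging bits between the two children:

```python
from numpy.random import rand, randint

# crossover two parents to create two children
def crossover(p1, p2, r_cross):
    c1, c2 = p1.copy(), p2.copy()
    if rand() < r_cross:
        # split point that is not on the end of the string
        pt = randint(1, len(p1)-2)
        c1 = p1[:pt] + p2[pt:]
        c2 = p2[:pt] + p1[pt:]
    return [c1, c2]

p1, p2 = [0, 0, 0, 0, 0], [1, 1, 1, 1, 1]
print(crossover(p1, p2, 0.0))  # rate 0.0: parents copied unchanged

c1, c2 = crossover(p1, p2, 1.0)  # rate 1.0: always recombine
# crossover only moves bits between children, so the multiset of bits is preserved
print(sorted(c1 + c2) == sorted(p1 + p2))  # True
```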
We also need a function to perform mutation. This procedure simply flips bits with a low probability controlled by the “r_mut” hyperparameter.
# mutation operator
def mutation(bitstring, r_mut):
    for i in range(len(bitstring)):
        # check for a mutation
        if rand() < r_mut:
            # flip the bit
            bitstring[i] = 1 - bitstring[i]
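Because rand() always returns a value below 1.0, a mutation rate of 1.0 flips every bit and a rate of 0.0 flips none, which gives a quick deterministic check that the operator mutates in place:

```python
from numpy.random import rand

# mutation operator: flip each bit with probability r_mut
def mutation(bitstring, r_mut):
    for i in range(len(bitstring)):
        if rand() < r_mut:
            bitstring[i] = 1 - bitstring[i]

bits = [0, 1, 0, 1]
mutation(bits, 0.0)  # rate 0.0: nothing changes
print(bits)  # [0, 1, 0, 1]
mutation(bits, 1.0)  # rate 1.0: every bit is flipped in place
print(bits)  # [1, 0, 1, 0]
```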
We can then iterate over the list of parents and create a list of children to be used as the next generation, calling the crossover and mutation functions as needed.
# create the next generation
children = list()
for i in range(0, n_pop, 2):
    # get selected parents in pairs
    p1, p2 = selected[i], selected[i+1]
    # crossover and mutation
    for c in crossover(p1, p2, r_cross):
        # mutation
        mutation(c, r_mut)
        # store for next generation
        children.append(c)
We can combine all of this into a function named genetic_algorithm() that takes the name of the objective function and the hyperparameters of the search, and returns the best solution found during the search.
# genetic algorithm
def genetic_algorithm(objective, n_bits, n_iter, n_pop, r_cross, r_mut):
    # initial population of random bitstring
    pop = [randint(0, 2, n_bits).tolist() for _ in range(n_pop)]
    # keep track of best solution
    best, best_eval = 0, objective(pop[0])
    # enumerate generations
    for gen in range(n_iter):
        # evaluate all candidates in the population
        scores = [objective(c) for c in pop]
        # check for new best solution
        for i in range(n_pop):
            if scores[i] < best_eval:
                best, best_eval = pop[i], scores[i]
                print(">%d, new best f(%s) = %.3f" % (gen, pop[i], scores[i]))
        # select parents
        selected = [selection(pop, scores) for _ in range(n_pop)]
        # create the next generation
        children = list()
        for i in range(0, n_pop, 2):
            # get selected parents in pairs
            p1, p2 = selected[i], selected[i+1]
            # crossover and mutation
            for c in crossover(p1, p2, r_cross):
                # mutation
                mutation(c, r_mut)
                # store for next generation
                children.append(c)
        # replace population
        pop = children
    return [best, best_eval]
Now that we have developed an implementation of the genetic algorithm, let’s explore how to apply it to the objective function.
Genetic algorithm for OneMax
In this section, we will apply the genetic algorithm to a binary-string-based optimization problem. The problem is called OneMax, and it evaluates a binary string based on the number of 1s in the string. For example, a bit string of length 20 scores 20 for a string of all 1s. Given that we have implemented the genetic algorithm to minimize the objective function, we can add a negative sign to this evaluation so that large positive values become large negative values. The onemax() function below implements this; it takes a bit string of integer values as input and returns the negative sum of the values.
# objective function
def onemax(x):
    return -sum(x)
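For example, the string [1, 1, 0, 1] contains three 1s, so onemax() returns -3, and a string of all 1s gives the minimum (best) possible value:

```python
# negated OneMax: count the 1s and flip the sign so the GA can minimize it
def onemax(x):
    return -sum(x)

print(onemax([1, 1, 0, 1]))  # -3
print(onemax([1] * 20))      # -20, the best possible score for 20 bits
```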
Next, we can configure the search.
The search will run 100 iterations, and we will use 20 bits in the candidate solutions, which means the best fitness is -20.0.
The population size will be 100, and we will use a crossover rate of 90% and a mutation rate of 5%. This configuration was chosen after some trial and error.
# define the total iterations
n_iter = 100
# bits
n_bits = 20
# define the population size
n_pop = 100
# crossover rate
r_cross = 0.9
# mutation rate
r_mut = 1.0 / float(n_bits)
The search can then be invoked and the best results reported.
# perform the genetic algorithm search
best, score = genetic_algorithm(onemax, n_bits, n_iter, n_pop, r_cross, r_mut)
print('Done!')
print('f(%s) = %f' % (best, score))
Taken together, a complete example of applying the genetic algorithm to the OneMax objective function is listed below.
# genetic algorithm search of the one max optimization problem
from numpy.random import randint
from numpy.random import rand

# objective function
def onemax(x):
    return -sum(x)

# tournament selection
def selection(pop, scores, k=3):
    # first random selection
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k-1):
        # check if better (e.g. perform a tournament)
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]

# crossover two parents to create two children
def crossover(p1, p2, r_cross):
    # children are copies of parents by default
    c1, c2 = p1.copy(), p2.copy()
    # check for recombination
    if rand() < r_cross:
        # select crossover point that is not on the end of the string
        pt = randint(1, len(p1)-2)
        # perform crossover
        c1 = p1[:pt] + p2[pt:]
        c2 = p2[:pt] + p1[pt:]
    return [c1, c2]

# mutation operator
def mutation(bitstring, r_mut):
    for i in range(len(bitstring)):
        # check for a mutation
        if rand() < r_mut:
            # flip the bit
            bitstring[i] = 1 - bitstring[i]

# genetic algorithm
def genetic_algorithm(objective, n_bits, n_iter, n_pop, r_cross, r_mut):
    # initial population of random bitstring
    pop = [randint(0, 2, n_bits).tolist() for _ in range(n_pop)]
    # keep track of best solution
    best, best_eval = 0, objective(pop[0])
    # enumerate generations
    for gen in range(n_iter):
        # evaluate all candidates in the population
        scores = [objective(c) for c in pop]
        # check for new best solution
        for i in range(n_pop):
            if scores[i] < best_eval:
                best, best_eval = pop[i], scores[i]
                print(">%d, new best f(%s) = %.3f" % (gen, pop[i], scores[i]))
        # select parents
        selected = [selection(pop, scores) for _ in range(n_pop)]
        # create the next generation
        children = list()
        for i in range(0, n_pop, 2):
            # get selected parents in pairs
            p1, p2 = selected[i], selected[i+1]
            # crossover and mutation
            for c in crossover(p1, p2, r_cross):
                # mutation
                mutation(c, r_mut)
                # store for next generation
                children.append(c)
        # replace population
        pop = children
    return [best, best_eval]

# define the total iterations
n_iter = 100
# bits
n_bits = 20
# define the population size
n_pop = 100
# crossover rate
r_cross = 0.9
# mutation rate
r_mut = 1.0 / float(n_bits)
# perform the genetic algorithm search
best, score = genetic_algorithm(onemax, n_bits, n_iter, n_pop, r_cross, r_mut)
print('Done!')
print('f(%s) = %f' % (best, score))
Running the example reports the best result each time an improvement is found, followed by the final best solution at the end of the search, which we expect to be the optimal solution.
Note: Your results may differ due to randomness of the algorithm or evaluation program, or due to differences in numerical accuracy. Consider running the example a few times and comparing the average results.
In this case, we can see that the search found the best solution after about eight generations.
>0, new best f([1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1]) = -14.000
>0, new best f([1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 0]) = -15.000
>1, new best f([1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]) = -16.000
>2, new best f([0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]) = -17.000
>2, new best f([1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) = -19.000
>8, new best f([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) = -20.000
Done!
f([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) = -20.000000
Genetic algorithm for continuous function optimization
Optimizing the OneMax function is not very interesting; we are more likely to want to optimize a continuous function. For example, we can define a minimization problem using the function x^2, which takes two input variables and has an optimum at f(0, 0) = 0.0.
# objective function
def objective(x):
    return x[0]**2.0 + x[1]**2.0
We can minimize this function with the genetic algorithm. First, we must define the bounds of each input variable.
# define range for input
bounds = [[-5.0, 5.0], [-5.0, 5.0]]
We will take the “n_bits” hyperparameter as the number of bits per input variable of the objective function and set it to 16 bits.
# bits per variable
n_bits = 16
This means that, given the two input variables, our actual bit string will have (16 * 2) = 32 bits. We need to update our mutation rate accordingly.
# mutation rate
r_mut = 1.0 / (float(n_bits) * len(bounds))
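A side effect of this encoding (assuming the bounds of [-5.0, 5.0] defined above) is that each variable is quantized: with 16 bits over a range of width 10, the smallest representable step is 10 / 2^16, which is exactly the granularity visible in the search output later. A quick sketch of the arithmetic:

```python
n_bits = 16
lower, upper = -5.0, 5.0
# smallest change a single increment of the encoded integer can make
step = (upper - lower) / 2**n_bits
print(step)  # 0.000152587890625
```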
Next, we need to ensure that the initial population creates random bit strings that are large enough.
# initial population of random bitstring
pop = [randint(0, 2, n_bits*len(bounds)).tolist() for _ in range(n_pop)]
Finally, we need to decode the bit strings into numbers before we can evaluate each candidate with the objective function.
We can do this by first decoding each substring to an integer and then scaling the integer to the desired range. This gives a vector of values in the range that can then be fed to the objective function for evaluation.
The decode() function below implements this, taking the bounds of the function, the number of bits per variable, and a bit string as input, and returning a list of decoded real values.
# decode bitstring to numbers
def decode(bounds, n_bits, bitstring):
    decoded = list()
    largest = 2**n_bits
    for i in range(len(bounds)):
        # extract the substring
        start, end = i * n_bits, (i * n_bits)+n_bits
        substring = bitstring[start:end]
        # convert bitstring to a string of chars
        chars = ''.join([str(s) for s in substring])
        # convert string to integer
        integer = int(chars, 2)
        # scale integer to desired range
        value = bounds[i][0] + (integer/largest) * (bounds[i][1] - bounds[i][0])
        # store
        decoded.append(value)
    return decoded
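As a small worked example (using a hypothetical 4-bit encoding of a single variable to keep the numbers easy), the substring 1000 decodes to the integer 8, which scales to the midpoint of the range [-5.0, 5.0]:

```python
# decode bitstring to numbers
def decode(bounds, n_bits, bitstring):
    decoded = list()
    largest = 2**n_bits
    for i in range(len(bounds)):
        # extract the substring for variable i
        start, end = i * n_bits, (i * n_bits) + n_bits
        substring = bitstring[start:end]
        # join the bits into a string and parse it as a base-2 integer
        chars = ''.join([str(s) for s in substring])
        integer = int(chars, 2)
        # scale the integer into the variable's range
        value = bounds[i][0] + (integer / largest) * (bounds[i][1] - bounds[i][0])
        decoded.append(value)
    return decoded

# one variable, 4 bits: 1000 -> 8 -> -5.0 + (8/16) * 10.0 = 0.0
print(decode([[-5.0, 5.0]], 4, [1, 0, 0, 0]))  # [0.0]
```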
We can then call it at the beginning of the algorithm loop to decode the population, and then evaluate the decoded version of the population.
# decode population
decoded = [decode(bounds, n_bits, p) for p in pop]
# evaluate all candidates in the population
scores = [objective(d) for d in decoded]
Taken together, a complete example of a genetic algorithm for continuous function optimization is listed below.
# genetic algorithm search for continuous function optimization
from numpy.random import randint
from numpy.random import rand

# objective function
def objective(x):
    return x[0]**2.0 + x[1]**2.0

# decode bitstring to numbers
def decode(bounds, n_bits, bitstring):
    decoded = list()
    largest = 2**n_bits
    for i in range(len(bounds)):
        # extract the substring
        start, end = i * n_bits, (i * n_bits)+n_bits
        substring = bitstring[start:end]
        # convert bitstring to a string of chars
        chars = ''.join([str(s) for s in substring])
        # convert string to integer
        integer = int(chars, 2)
        # scale integer to desired range
        value = bounds[i][0] + (integer/largest) * (bounds[i][1] - bounds[i][0])
        # store
        decoded.append(value)
    return decoded

# tournament selection
def selection(pop, scores, k=3):
    # first random selection
    selection_ix = randint(len(pop))
    for ix in randint(0, len(pop), k-1):
        # check if better (e.g. perform a tournament)
        if scores[ix] < scores[selection_ix]:
            selection_ix = ix
    return pop[selection_ix]

# crossover two parents to create two children
def crossover(p1, p2, r_cross):
    # children are copies of parents by default
    c1, c2 = p1.copy(), p2.copy()
    # check for recombination
    if rand() < r_cross:
        # select crossover point that is not on the end of the string
        pt = randint(1, len(p1)-2)
        # perform crossover
        c1 = p1[:pt] + p2[pt:]
        c2 = p2[:pt] + p1[pt:]
    return [c1, c2]

# mutation operator
def mutation(bitstring, r_mut):
    for i in range(len(bitstring)):
        # check for a mutation
        if rand() < r_mut:
            # flip the bit
            bitstring[i] = 1 - bitstring[i]

# genetic algorithm
def genetic_algorithm(objective, bounds, n_bits, n_iter, n_pop, r_cross, r_mut):
    # initial population of random bitstring
    pop = [randint(0, 2, n_bits*len(bounds)).tolist() for _ in range(n_pop)]
    # keep track of best solution
    best, best_eval = 0, objective(decode(bounds, n_bits, pop[0]))
    # enumerate generations
    for gen in range(n_iter):
        # decode population
        decoded = [decode(bounds, n_bits, p) for p in pop]
        # evaluate all candidates in the population
        scores = [objective(d) for d in decoded]
        # check for new best solution
        for i in range(n_pop):
            if scores[i] < best_eval:
                best, best_eval = pop[i], scores[i]
                print(">%d, new best f(%s) = %f" % (gen, decoded[i], scores[i]))
        # select parents
        selected = [selection(pop, scores) for _ in range(n_pop)]
        # create the next generation
        children = list()
        for i in range(0, n_pop, 2):
            # get selected parents in pairs
            p1, p2 = selected[i], selected[i+1]
            # crossover and mutation
            for c in crossover(p1, p2, r_cross):
                # mutation
                mutation(c, r_mut)
                # store for next generation
                children.append(c)
        # replace population
        pop = children
    return [best, best_eval]

# define range for input
bounds = [[-5.0, 5.0], [-5.0, 5.0]]
# define the total iterations
n_iter = 100
# bits per variable
n_bits = 16
# define the population size
n_pop = 100
# crossover rate
r_cross = 0.9
# mutation rate
r_mut = 1.0 / (float(n_bits) * len(bounds))
# perform the genetic algorithm search
best, score = genetic_algorithm(objective, bounds, n_bits, n_iter, n_pop, r_cross, r_mut)
print('Done!')
decoded = decode(bounds, n_bits, best)
print('f(%s) = %f' % (decoded, score))
Running the example reports the best decoded result each time an improvement is found, as well as the best decoded solution at the end of the run.
Note: Your results may differ due to randomness of the algorithm or evaluation program, or due to differences in numerical accuracy. Consider running the example a few times and comparing the average results.
In this case, we can see that the algorithm found an input very close to f (0.0, 0.0) = 0.0.
>0, new best f([0.785064697265625, 0.807647705078125]) = 1.268621
>0, new best f([0.385894775390625, 0.342864990234375]) = 0.266471
>1, new best f([0.342559814453125, 0.1068115234375]) = 0.128756
>2, new best f([0.038909912109375, 0.30242919921875]) = 0.092977
>2, new best f([0.145721435546875, 0.1849365234375]) = 0.055436
>3, new best f([0.14404296875, 0.029754638671875]) = 0.021634
>5, new best f([0.066680908203125, 0.096435546875]) = 0.013746
>5, new best f([0.036468505859375, 0.10711669921875]) = 0.012804
>6, new best f([0.038909912109375, 0.099639892578125]) = 0.011442
>7, new best f([0.033111572265625, 0.09674072265625]) = 0.010455
>7, new best f([0.036468505859375, 0.05584716796875]) = 0.004449
>10, new best f([0.058746337890625, 0.008087158203125]) = 0.003517
>10, new best f([0.031585693359375, 0.008087158203125]) = 0.001063
>12, new best f([0.022125244140625, 0.008087158203125]) = 0.000555
>13, new best f([0.022125244140625, 0.00701904296875]) = 0.000539
>13, new best f([0.013885498046875, 0.008087158203125]) = 0.000258
>16, new best f([0.011444091796875, 0.00518798828125]) = 0.000158
>17, new best f([0.0115966796875, 0.00091552734375]) = 0.000135
>17, new best f([0.004730224609375, 0.00335693359375]) = 0.000034
>20, new best f([0.004425048828125, 0.00274658203125]) = 0.000027
>21, new best f([0.002288818359375, 0.00091552734375]) = 0.000006
>22, new best f([0.001983642578125, 0.00091552734375]) = 0.000005
>22, new best f([0.001983642578125, 0.0006103515625]) = 0.000004
>24, new best f([0.001373291015625, 0.001068115234375]) = 0.000003
>25, new best f([0.001373291015625, 0.00091552734375]) = 0.000003
>26, new best f([0.001373291015625, 0.0006103515625]) = 0.000002
>27, new best f([0.001068115234375, 0.0006103515625]) = 0.000002
>29, new best f([0.000152587890625, 0.00091552734375]) = 0.000001
>33, new best f([0.0006103515625, 0.0]) = 0.000000
>34, new best f([0.000152587890625, 0.00030517578125]) = 0.000000
>43, new best f([0.00030517578125, 0.0]) = 0.000000
>60, new best f([0.000152587890625, 0.000152587890625]) = 0.000000
>65, new best f([0.000152587890625, 0.0]) = 0.000000
Done!
f([0.000152587890625, 0.0]) = 0.000000
Author: Yishui Hancheng, CSDN blogger expert. Research interests: machine learning, deep learning, NLP, CV.
Blog: yishuihancheng.blog.csdn.net