Evol is a clear DSL for composable evolutionary algorithms, optimised for joy.

Overview

Evol is a clear DSL for composable evolutionary algorithms, optimised for joy.

Installation

We currently support Python 3.6 and Python 3.7, and you can install it via pip:

pip install evol

Documentation

For more details you can read the docs, but we advise everyone to get started by first checking out the examples in the /examples directory. These standalone examples should show the spirit of usage better than the docs.

The Gist

The main idea is that you should be able to define a complex algorithm in a composable way. To explain what we mean by this, let's consider two evolutionary algorithms for travelling salesman problems.

The first approach takes a collection of solutions and applies:

  1. a survival step where only the top 50% of solutions survive
  2. a reproduction step where the population reproduces using a crossover of genes
  3. a mutation step where certain members mutate
  4. repeat this, maybe 1000 times or more!

We can also think of another approach:

  1. pick the best solution of the population
  2. make random changes to this parent and generate new solutions
  3. repeat this, maybe 1000 times or more!

One could even combine the two algorithms into a new one:

  1. run algorithm 1 50 times
  2. run algorithm 2 10 times
  3. repeat this, maybe 1000 times or more!

You might notice that many parts of these algorithms are similar, and it is the goal of this library to automate those parts. We hope to provide an API that is fun to use and makes it easy to tweak your heuristics.

A working example of something similar to what is depicted above is shown below. You can also find this code as an example in /examples/simple_nonlinear.py.

import random
from evol import Population, Evolution

random.seed(42)

def random_start():
    """
    This function generates a random (x,y) coordinate
    """
    return (random.random() - 0.5) * 20, (random.random() - 0.5) * 20

def func_to_optimise(xy):
    """
    This is the function we want to optimise (maximize)
    """
    x, y = xy
    return -(1-x)**2 - 2*(2-x**2)**2

def pick_random_parents(pop):
    """
    This is how we are going to select parents from the population
    """
    mom = random.choice(pop)
    dad = random.choice(pop)
    return mom, dad

def make_child(mom, dad):
    """
    This function describes how two candidates combine into a new candidate
    Note that the output is a tuple, just like the output of `random_start`
    We leave it to the developer to ensure that chromosomes are of the same type
    """
    child_x = (mom[0] + dad[0])/2
    child_y = (mom[1] + dad[1])/2
    return child_x, child_y

def add_noise(chromosome, sigma):
    """
    This is a function that will add some noise to the chromosome.
    """
    new_x = chromosome[0] + (random.random()-0.5) * sigma
    new_y = chromosome[1] + (random.random()-0.5) * sigma
    return new_x, new_y

# We start by defining a population with candidates.
pop = Population(chromosomes=[random_start() for _ in range(200)],
                 eval_function=func_to_optimise, maximize=True)

# We define a sequence of steps to change these candidates
evo1 = (Evolution()
       .survive(fraction=0.5)
       .breed(parent_picker=pick_random_parents, combiner=make_child)
       .mutate(func=add_noise, sigma=1))

# We define another sequence of steps to change these candidates
evo2 = (Evolution()
       .survive(n=1)
       .breed(parent_picker=pick_random_parents, combiner=make_child)
       .mutate(func=add_noise, sigma=0.2))

# We are combining two evolutions into a third one. You don't have to
# but this approach demonstrates the flexibility of the library.
evo3 = (Evolution()
       .repeat(evo1, n=50)
       .repeat(evo2, n=10)
       .evaluate())

# In this step we are telling evol to apply the evolutions
# to the population of candidates.
pop = pop.evolve(evo3, n=5)
print(f"the best score found: {max([i.fitness for i in pop])}")

Getting Started

The best place to get started is the /examples folder on GitHub. This folder contains self-contained examples that work out of the box.

How does it compare to ...

  • ... deap? We think our library is more composable and Pythonic while not removing any functionality. Our library may be a bit slower, though.
  • ... hyperopt? Since we force the user to write the actual algorithm, we are less of a black box. Hyperopt is meant for hyperparameter tuning in machine learning and has better support for search in scikit-learn.
  • ... inspyred? That library offers a simple way to get started, but the project seems less actively maintained than ours.
Comments
  • How to set up evolution steps

    The evolution will contain a chain of steps, which can be applied in sequence to a population. In principle these steps need to be no more than functions - we could simply do

    for step in self.chain:
        population = step(population)
    

    These step functions would simply look like:

    def step(population):
        return population.evaluate()
    

    but we shouldn't replace them with lambdas, for the sake of pickling.

    In order to provide arguments to the methods of the population inside such functions, we could re-define the functions with the proper arguments each time we use them.

    Now, on the other hand, it would be nice to be able to give a name to such a step, for example to identify the step in logging or to make it easier to debug. We could achieve that by creating a namedtuple with name and function fields. Once we have a namedtuple, we could decide not to bake the arguments into the function, but to put them in their own field. The application of the steps would then look like

    for step in self.chain:
        population = step.function(population, **step.kwargs)
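
    For concreteness, a minimal sketch of such a namedtuple-based step (the names and the lazy kwarg here are illustrative only, not a proposed final API):

    from collections import namedtuple

    Step = namedtuple("Step", ["name", "function", "kwargs"])

    def evaluate_population(population, **kwargs):
        # a named function rather than a lambda, so the step stays picklable
        return population.evaluate(**kwargs)

    step = Step(name="evaluate", function=evaluate_population, kwargs={"lazy": False})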
    

    We can take it one step further by defining an EvolutionStep class and making classes for each of the step types that inherit from that class. This would look like:

    class EvolutionStep:
    
        def __init__(self, name, **kwargs):
            self.name = name
            self.kwargs = kwargs
    
    
    class EvaluationStep(EvolutionStep):
    
        def __init__(self, name, lazy=False):
            EvolutionStep.__init__(self, name=name, lazy=lazy)
    
        def apply(self, population):
            return population.evaluate(**self.kwargs)
    

    which has the very nice property that the arguments are well taken care of, as the function is only defined once (as a method on the step class). In this case the Evolution.evaluate method can look as simple as

    def evaluate(self, name=None, lazy=False):
        result = copy(self)  # assumes `from copy import copy`
        result.chain.append(EvaluationStep(name=name, lazy=lazy))
        return result
    

    On the other hand we will need to define a separate Step class for each operation.

    @koaning, any preference?

    opened by rogiervandergeer 16
  • Islands

    I've created a pull request for the islands (#37), but we still need to do some work before we can merge this. In this issue I'll explain some of the changes I've made, and I have a few design decisions to make.

    Let's start at the beginning; we have a Population, which is a collection of Individuals on which we can perform a multitude of operations (evaluate, mutate, etc). The Population can also take an Evolution object and perform a series of such operations.

    Now we want to introduce islands, where the population is split into several subgroups, on each of which we can perform the same operations as on the Population. So we add two (or three, see below) operations: split and combine, with which we can split a population into multiple islands and then merge these islands back into a single population.

    I have assumed that we will always want the islands to go through the same evolution. I.e. if we split a population into two islands and then apply an evolution on those islands, the evolution will be the same for both islands. Note that we can pass an island-id to the combiner, picker and mutator functions such that these behave differently on each island - but we assume that the order of the steps is the same.

    Question 1: is that a reasonable assumption?

    If we always apply an operation to all islands at the same time, then the interface to the islands will be the same as to the Population: calling mutate on a group of islands will simply call mutate on each island. The same goes for all other operations (except split and combine). Hence a group of islands behaves exactly the same as a Population, and we can apply the same Evolution to each (as long as we do not use split or combine). Since the interface to a group of islands is so close to that of a Population, it makes sense to name the group of islands an IslandPopulation.

    Now we have Population, which is a collection of Individuals, and IslandPopulation, which has the same interface as Population (except, again, for split and combine) and which is a collection of Populations. Since Population and IslandPopulation have mostly the same interface it makes sense for them to be of the same type, but as IslandPopulation contains Populations, they cannot inherit from each other. Therefore I've created an abstract base class called PopulationBase of which both the Population and the IslandPopulation inherit. (And since ContestPopulation inherits from Population, also that inherits from PopulationBase.) The great thing about this is that IslandPopulation no longer has to be a collection of Population objects per se, but can be a collection of PopulationBase objects - making it a collection of any type of population.
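
    To make the delegation idea concrete, here is a minimal, self-contained sketch (hypothetical names, not the evol API): a group of islands exposes the same kind of interface as a single population and simply forwards each operation to every island.

    class IslandGroup:
        def __init__(self, islands):
            # each island is represented here as a plain list of chromosomes
            self.islands = islands

        def mutate(self, func, **kwargs):
            # apply the same mutation to every island, as a Population would apply it to itself
            self.islands = [[func(c, **kwargs) for c in island] for island in self.islands]
            return self

        def combine(self):
            # merge the islands back into one flat collection of chromosomes
            return [c for island in self.islands for c in island]

    group = IslandGroup([[1.0, 2.0], [3.0, 4.0]])
    merged = group.mutate(lambda c, delta: c + delta, delta=0.1).combine()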

    The above means that operations like mutate and evaluate can be performed on any object of type PopulationBase. But I've had to make exceptions for split and combine, since these transform a Population into an IslandPopulation and vice versa. For example, calling split on a Population should clearly result in an IslandPopulation, while calling combine on an IslandPopulation should result in a Population (or ContestPopulation if it was a collection of those). Naturally, Population.combine should not exist (or throw an error, or do nothing). And now comes the difficult part:

    Question 2: what should IslandPopulation.split do? Should it

    1. Refuse to work, just like you cannot combine a Population.
    2. Pass the split to each island, returning the same IslandPopulation, now containing not Populations but IslandPopulations. E.g. A[1, 2, 3] -> A[1[a, b], 2[c, d], 3[e, f]]
    3. Split itself, such that it returns a new IslandPopulation, containing islands which contain the original Populations. E.g. A[1, 2, 3, 4] -> q[A[1, 2], B[3, 4]]
    4. Allow both options 2 and 3 using for example a flag pushdown or level.
    5. Split the internal populations such that the original IslandPopulation now contains more islands. E.g. A[1, 2] -> A[1a, 1b, 2a, 2b].

    If we choose one of options 2-4 we will have to deal with many layers of sub-islands. This makes functions like combine much more complicated, since we would have to specify which layer to combine (or always push down as far as possible) and verify that that layer exists. Also, if we allow sub-islands, we need to make sure that the depth of each branch stays the same. If someone manipulates an IslandPopulation to look like this: A[1, 2[c, d]], then what do we do if they request the second layer to be combined?

    Next, we need to decide the workings of split. Intuitively I find that split should split the population into several groups, such that each individual ends up in only one of them. I noticed that @koaning expects each individual to end up in each group, effectively duplicating the original Population. Since these are two fundamentally different operations, I propose we make both split() and duplicate() available.

    Question 3: do you agree?

    This does mean that we have to keep track of the original intended_size somehow. For example, suppose I have a Population of size 100. If I duplicate this into four groups, I end up with an IslandPopulation consisting of four islands with 400 individuals total. However, if I split it into four groups, I get an IslandPopulation consisting of four islands with 25 individuals each, and hence 100 total. If we recombine these, we want to obtain a Population with intended_size == 100. Of course, when we combine the duplicated islands we will have an initial population of size 400, but this should be reduced back to 100 (either immediately or after the first survive and breed steps).

    For the duplicated islands, it is clear that the intended_size of each island is 100. For the split islands, since their initial size is 25, this is not so clear.

    Question 4: which is the best option:

    1. When using split, the intended size of each island is a fraction of the original (25 in the example), and when we use duplicate the intended size of each island is the same as the original (100 in the example). We keep track of the original intended size of the whole population and use that when we combine.
    2. When using split or duplicate, the intended size of the islands is a fraction of the original (25 in the example). In the case of duplicate, each island will initially be much larger than intended.
    3. Same as 2, but in the case of duplicate, we first use survive to cut the original population back to 25 before we duplicate.
    4. When using split or duplicate, the intended size of the islands is the same as the original (100). In the case of split each island will initially be much smaller than intended.
    5. Same as 4, but in the case of split we call breed immediately after the split such that all populations are of the right size.

    Of course we may allow the user to override the intended size at any point.

    @koaning I'd love to hear your opinion

    opened by rogiervandergeer 13
  • Thoughts on Parallelism

    We already apply some performance tricks with the .evaluate() mechanic, but we may be able to add some form of parallelism/queueing to perhaps make things even more performant.

    In terms of easy wins: it seems like .map (and thus mutate) can be run in parallel in general. The same would hold for .evaluate() in the BasePopulation.
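
    As a rough sketch (not something evol does today), the evaluation could be farmed out with the standard library, assuming the evaluation function and the chromosomes are picklable:

    from multiprocessing import Pool

    def parallel_scores(chromosomes, eval_function, processes=4):
        # evaluate each chromosome in a separate worker process
        with Pool(processes=processes) as pool:
            return pool.map(eval_function, chromosomes)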

    Do we want to explore this?

    opened by koaning 11
  • Keeping track of the fittest individual

    In addition to logging (as discussed in #15) and checkpointing (as discussed in #33) I think we need to keep track of the best-performing individual. I think we need to do this separately from logging for two reasons:

    • the best individual is something you always want to find, regardless of whether you want to log,
    • if logging is going to happen at an interval other than once every iteration, there is a chance that you do not log the best performing individual when it mutates before the logging moment.

    We could implement this by storing a single individual inside the Population. I would suggest making this part of the evaluate call, and always (whether the call is lazy or not) replacing the current best by an individual when it is evaluated and its fitness score is better than the current best.
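
    As a small sketch of that bookkeeping (a hypothetical helper, not existing evol code), with an individual represented as a (chromosome, fitness) pair:

    def update_historical_best(best, candidate, maximize=True):
        # keep whichever of the two evaluated individuals has the better fitness
        if best is None:
            return candidate
        if maximize:
            return candidate if candidate[1] > best[1] else best
        return candidate if candidate[1] < best[1] else best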

    Naturally the result of this is nonsense for the ContestPopulation, as there the fitness depends on the rest of the population too. In this case it would only make sense to store the entire population together with the fittest individual, otherwise it would be impossible to recompute the result. Of course we cannot store populations inside populations, and this would be a typical case to be solved by logging.

    Even in a normal Population, for stochastic evaluation functions the 'best individual' may of course be a lucky shot; but I think this is not a problem for us to solve.

    Technically speaking, to me it feels like we want to store the best individual in a variable named something like best or historical_best. Currently we have min_individual and max_individual - and I don't recall why we didn't implement a current_best (or best). @koaning do you remember the reason?

    opened by rogiervandergeer 9
  • Cost functions

    Would we want to have problem instances featured in our library as well? It may help people to get started - say, cost functions for the following (a small illustrative sketch follows the list):

    • easy/hard continuous search spaces
    • basic TSP problems
    • 8-queen-like problems
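
    For illustration, a tiny TSP-style cost function (hypothetical, just to show the shape such a problem instance could take), where a chromosome is a closed tour over a list of (x, y) cities and lower is better:

    def tour_length(tour):
        # sum of the euclidean distances along the closed tour
        total = 0.0
        for (x1, y1), (x2, y2) in zip(tour, tour[1:] + tour[:1]):
            total += ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
        return total

    print(tour_length([(0, 0), (0, 1), (1, 1), (1, 0)]))  # 4.0
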
    opened by koaning 8
  • Logging

    @rogiervandergeer I am creating this pull request merely to get your attention and to shoot at some things. The point is that I think parts are correct, but a discussion may be helpful now.

    I've stumbled on some things that need discussion along with logging.

    1. I am proposing that we pass an evol.Logger object to the population upon initialisation. If you do not pass anything in, then we will use the BaseLogger.
    2. If you call .log() via the population/evolution, then the logger that was passed in at initialisation will determine what and how to log.
    3. You can see in Travis what the output of a BaseLogger is; the current idea is to print to standard out if no file is given. It will log to a file instead if a file is given.
    4. The evol.Logger object is split into two verbs: .log() and .handle(). The idea is that the former deals with what to log and the latter with how to handle it/where to put it (see the sketch after this list). This should leave it free for anybody to write a custom logger, but allows us to come up with a few basic ones that are general enough. We can have a BaseLogger that just logs the population performance, but offer a PerformanceLogger that logs the difference in time between two log moments.
    5. I am currently logging the datetime, an id for the population, an id for the individual, the fitness of every individual and the chromosome. Is this enough?
    6. There is a todo at line 202 in population.py -> do we still want to keep track of a population's/individual's age? I think we don't need it; it may be nicer to keep track of a generation via the logging object (i.e. we keep track of how often .log() is called and refer to that as a generation).
    7. We can try to use Python's logging framework here, but I am wondering if we need it. Opinions?
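
    A hypothetical sketch of the log/handle split from point 4 (class name and fields are illustrative only):

    import sys
    from datetime import datetime

    class SketchLogger:
        def __init__(self, file=None):
            self.file = file

        def log(self, population_id, individuals):
            # 'what' to log: one line per (chromosome, fitness) pair
            for chromosome, fitness in individuals:
                self.handle(f"{datetime.now()},{population_id},{fitness},{chromosome}")

        def handle(self, line):
            # 'where' it goes: standard out by default, a file if one was given
            if self.file is None:
                print(line, file=sys.stdout)
            else:
                with open(self.file, "a") as handle:
                    handle.write(line + "\n")
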
    opened by koaning 7
  • Rebranding it as MapReproduce? ;)

    I know it's kind of arrogant to be some random person hinting at renaming a repo... but I still think it is a good idea. (At PyData Warsaw I chuckled a few times. Yet, 3 days later I still think it is worth doing.)

    As it is essentially a functional (map, filter, kind-of-reduce) approach to evolutionary algorithms, MapReproduce would capture its spirit in its name.

    (No, I don't mean renaming any parts of API, at least not for this reason.)

    opened by stared 7
  • Python version

    Currently we only support Python 3.6, since we use f'{string}' syntax. On many platforms Python 3.6 isn't readily available (e.g. the Raspberry Pi, on which I haven't even been able to compile it so far) - don't we want to support 3.4 too?

    opened by rogiervandergeer 7
  • examples won't run, import error

    I am having trouble making sense of this one.

    ➜  evol git:(master) ✗ pip uninstall evol; pip install .
    Uninstalling evol-0.1:
      /anaconda/lib/python3.6/site-packages/evol-0.1-py3.6.egg-info
    Proceed (y/n)? y
      Successfully uninstalled evol-0.1
    Processing /Users/code/Development/evol
    Requirement already satisfied: pytest in /anaconda/lib/python3.6/site-packages (from evol==0.1)
    Requirement already satisfied: py>=1.4.29 in /anaconda/lib/python3.6/site-packages (from pytest->evol==0.1)
    Requirement already satisfied: setuptools in /anaconda/lib/python3.6/site-packages (from pytest->evol==0.1)
    Installing collected packages: evol
      Running setup.py install for evol ... done
    Successfully installed evol-0.1
    ➜  evol git:(master) ✗ python examples/example_pheromone.py
    Traceback (most recent call last):
      File "examples/example_pheromone.py", line 13, in <module>
        from evol import Population, Evolution
      File "/anaconda/lib/python3.6/site-packages/evol/__init__.py", line 2, in <module>
        from .evolution import Evolution
      File "/anaconda/lib/python3.6/site-packages/evol/evolution.py", line 3, in <module>
        from .population import Population
      File "/anaconda/lib/python3.6/site-packages/evol/population.py", line 2, in <module>
        from evol.helpers.utils import select_arguments
      File "/anaconda/lib/python3.6/site-packages/evol/helpers.py", line 6, in <module>
        from evol.population import Population
    ImportError: cannot import name 'Population'
    

    Did you get something similar? I am wondering if this is my system that is acting up.

    opened by koaning 6
  • how to do a grid search

    Depending on the problem you're dealing with, you may want to supply the population with a predefined list of chromosomes instead of a function that generates a single individual. This way the end user could specify a grid to start from.

    We could accommodate this flow by adding an init_chromosomes parameter to Population.__init__, where we would check that either init_func or init_chromosomes is used.

    Any thoughts for/against this? After thinking it over, it seems like the only way to do this in our current API is to first generate a population and then overwrite all the individuals.
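
    For illustration, a predefined grid of (x, y) chromosomes that an end user might want to start from instead of random initialisation (such a list is what init_chromosomes would receive):

    from itertools import product

    # a 21 x 21 grid spanning [-1, 1] x [-1, 1]
    grid_chromosomes = [(x / 10, y / 10) for x, y in product(range(-10, 11), repeat=2)]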

    opened by koaning 6
  • how to deal with kwargs

    def init_func():
        return 1
    
    
    def eval_func(x):
        return x
    
    
    def pick_two_random_parents(population):
        return random.choices(population, k=2)
    
    
    def pick_n_random_parents(population, n_parents=2):
        return random.choices(population, k=n_parents)
    
    
    def combine_two_parents(mom, dad):
        return (mom+dad)/2
    
    
    def general_combiner(*parents):
        return sum(parents)/len(parents)
    
    pop1 = Population(init_function=init_func, eval_function=eval_func, size=200)
    pop1.survive(n=50).breed(parent_picker=pick_two_random_parents, combiner=combine_two_parents)
    
    pop2 = Population(init_function=init_func, eval_function=eval_func, size=200)
    pop2.survive(n=50).breed(parent_picker=pick_n_random_parents, combiner=general_combiner)
    

    What is a nice way to change the n_parents parameter? Something like:

    pop2.survive(n=50).breed(parent_picker=pick_n_random_parents, combiner=general_combiner, n_parents=3)
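
    One possible alternative, just as a sketch (not necessarily what was chosen for evol): bake the extra argument in with functools.partial, so the breed signature stays unchanged:

    from functools import partial

    pick_three_random_parents = partial(pick_n_random_parents, n_parents=3)
    # pop2.survive(n=50).breed(parent_picker=pick_three_random_parents, combiner=general_combiner)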
    
    opened by koaning 6
  • Is it possible to implement elitism?

    I wrote this function and used it as a callback to attempt elitist selection, without success:

    from copy import deepcopy

    best_chromosome = None
    
    def save_best(population):
        global best_chromosome
        
        population_best = population.current_best
        
        if best_chromosome is None or population_best.fitness > best_chromosome.fitness:
            best_chromosome = deepcopy(population_best)
    

    Is there a way to do it?

    opened by ELC 16
  • CI Fix

    I've made the repo flake8-compatible again and I've changed the Makefile. Flake8 raised this warning without the change:

    WARNING: flake8 setuptools integration is deprecated and scheduled for removal in 4.x.  For more information, see https://gitlab.com/pycqa/flake8/issues/544
    
    opened by koaning 0
  • WIP: Change the signature of pick_random

    Parent pickers are no longer passed any kwargs. pick_random must now be initialised before use, with the number of parents passed to it upon initialisation. In addition, pickers must always return a sequence of picked parents - even if it contains only one.

    These changes make it much easier to implement more complex picking algorithms, and in addition they remove the requirement for the select_arguments() decorator, which hurts my eyes.
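
    A hypothetical picker following that shape (illustrative only, not the merged implementation): it is constructed with the number of parents and always returns a tuple of picks.

    import random

    class PickRandom:
        def __init__(self, n_parents=2):
            self.n_parents = n_parents

        def __call__(self, population):
            # always return a sequence, even when n_parents == 1
            return tuple(random.choice(population) for _ in range(self.n_parents))

    picker = PickRandom(n_parents=3)
    parents = picker([1, 2, 3, 4, 5])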

    opened by rogiervandergeer 7
  • DiceDots: a problem for contest population

    I heard of a cool puzzle.

    Suppose that you can take a cube and draw 18 dots on it, such that you can make a custom die with a custom number of eyes on each side. You can determine which die beats which other dice, so the question is: what is the most balanced die?

    This might be a fun problem to explain the ContestPopulation with, and it might deserve to make an appearance as a problem instance.
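
    For illustration, a tiny helper (hypothetical, not part of evol) that scores one 18-dot die against another by counting wins over all 36 face pairings:

    from itertools import product

    def score(die_a, die_b):
        # each die is a list of 6 face values summing to 18; positive means die_a tends to win
        wins = sum(a > b for a, b in product(die_a, die_b))
        losses = sum(a < b for a, b in product(die_a, die_b))
        return wins - losses

    print(score([3, 3, 3, 3, 3, 3], [1, 1, 1, 5, 5, 5]))  # 0: these two are evenly matched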

    opened by koaning 0