Monday, January 24, 2011

New experiments in embodied evolutionary swarm robotics

My PhD student Paul has started a new series of experiments in embodied evolution in the swarm robotics lab. Here's a picture showing his experiment with 3 Linux e-puck robots in a small circular arena together with an infra-red beacon (at about 2 o'clock).


The task the robots are engaged in is collective foraging for food. Actually there's nothing much to see here because the food items are virtual (i.e. invisible) blobs in the arena that the robots have to 'find', then 'pick up' and 'transport' to the nest (again virtually). The nest region is marked by the infra-red beacon - so the robots 'deposit' the food items in the pool of IR light in the arena just in front of the beacon. The reason we don't bother making physical food items and grippers, etc, is that this would entail engineering work that's really not important here. You see, here we are not so interested in collective foraging - it's just a test problem for investigating the thing we're really interested in, which is embodied evolution.

The point of the experiment is this: at the start the robots don't know how to forage for food; during the experiment they must collectively 'evolve' the ability to forage. Paul is here researching the process of collective evolution. Before explaining what's going on 'under the hood' of these robots, let me give some background. Evolutionary robotics has been around for nearly 20 years. The idea is that instead of hand-designing the robot's control system we use an artificial process inspired by Darwinian evolution, called a genetic algorithm. It's really a way of automating the design. Evolutionary algorithms have been shown to be a very efficient way of searching the so-called design space and, in theory, will come up with (literally evolve) better solutions than we can invent by hand. Much more recent is the study of evolutionary swarm robotics (which is why there's no Wikipedia entry yet), which tackles the harder problem of evolving the controllers for individual robots in a swarm such that, collectively, the swarm will self-organise to solve the overall task.

Still with me? Good. Now let me explain what's going on in the robots of Paul's experiment. Each robot has inside it a simulation of itself and its environment (food, other robots and the nest). That simulation is not run once, but many times over within a genetic algorithm inside the robot. Thus each robot is running a simulation of the process of evolution, of itself, in itself. When that process completes (about once a minute), the best performing evolved controller is transferred into the real robot's controller. Since the embodied evolutionary process runs through several (simulated) generations of robot controller, the final winner of each evolutionary competition is, in effect, a great great... grandchild of the robot controller at the start of each cycle. While the real robots are driving around in the arena foraging (virtual) food and returning it to the nest, simulated evolution is running - in parallel - as a background process. Every minute or so the real robots' controllers are updated with the latest generation of (hopefully improved) evolved controllers, so what we observe is the robots gradually getting better and better at collective foraging.

If you think this sounds complicated – it is. The software architecture that Paul has built to accomplish this is ferociously complex and all the more remarkable because it fits within a robot about the size of a salt shaker. But in essence it is like this: what’s going on inside the robots is a bit like you imagining lots of different ways of riding a bike over and over, inside your head, while actually riding a bike.
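To make the cycle described above a little more concrete, here is a minimal Python sketch of one robot repeatedly running a small genetic algorithm and then adopting the winner. It is only an illustration under loud assumptions, not Paul's software: the controller is reduced to a list of four numbers, the internal simulator is replaced by a toy scoring function (`simulated_fitness`, with a made-up `TARGET`), and everything runs sequentially rather than as a background process on the robot. All names and parameter values are hypothetical.

```python
import random

# ASSUMPTION: a made-up "ideal" controller. In the real system there is no
# known target; fitness would come from running the embedded simulator.
TARGET = [0.5, -0.2, 0.8, 0.1]


def simulated_fitness(genome):
    """Toy stand-in for the internal simulation: higher is better."""
    return -sum((g - t) ** 2 for g, t in zip(genome, TARGET))


def mutate(genome, rng, sigma):
    """Return a copy of the genome with Gaussian noise added to every gene."""
    return [g + rng.gauss(0.0, sigma) for g in genome]


def evolve_once(current, rng, pop_size=20, generations=10, sigma=0.1):
    """One evolutionary cycle: several simulated generations, return the winner."""
    # Seed the population with the current controller plus mutated copies.
    population = [list(current)]
    population += [mutate(current, rng, sigma) for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=simulated_fitness, reverse=True)
        parents = ranked[: pop_size // 4]  # keep the best quarter unchanged
        children = [
            mutate(rng.choice(parents), rng, sigma)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=simulated_fitness)


def run_robot(cycles, rng):
    """Repeat the cycle; each winner becomes the robot's new controller."""
    controller = [0.0] * len(TARGET)  # starts out knowing nothing
    history = [simulated_fitness(controller)]
    for _ in range(cycles):
        controller = evolve_once(controller, rng)  # "about once a minute"
        history.append(simulated_fitness(controller))
    return controller, history


if __name__ == "__main__":
    _, scores = run_robot(cycles=5, rng=random.Random(1))
    print(scores)
```

Because the best quarter of each generation is carried over unchanged, the score of the adopted controller never gets worse from one cycle to the next in this toy version. On a real robot the score comes from a noisy, imperfect simulation, so no such guarantee holds.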

Putting a simulation inside a robot is something roboticists refer to as ‘robots with internal models’ and if we are to build real-world robots that are more autonomous, more adaptable – in short smarter, this kind of complexity is something we will have to master.

If you’ve made it this far, you might well ask the question, “what if the simulation inside the robot is an inaccurate representation of the real world – won’t that mean the evolved controller will be rubbish?” You would be right. One of the problems that has dogged evolutionary robotics is known as the 'reality gap'. It is the gap between the real world and the simulated world, which means that a controller evolved (and therefore optimised) in simulation typically doesn't work very well - or sometimes not at all - when transferred to the real robot and run in the real world. Paul is addressing this hard problem by also evolving the embedded simulators at the same time as evolving the robot's controllers; a process called co-evolution. This is where having a swarm of robots is a real advantage: just as we have a population of simulated controllers evolving inside each robot, we have a population of simulators - one per robot - evolving collectively across the swarm.
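As a rough illustration of what "a population of simulators, one per robot" could look like, here is a toy Python sketch. Almost everything in it is an assumption made for illustration, since the post does not say how the simulators are scored or exchanged: each simulator is reduced to a single hypothetical parameter (a wheel-slip factor), it is scored by how well it predicts the robot's own noisy real-world measurements, and a robot adopts a mutated copy of another robot's simulator only if that copy predicts its own measurements better.

```python
import random

# ASSUMPTION: the "real world" has a wheel-slip factor the robots don't know.
TRUE_SLIP = 0.3


def real_world_distance(command, rng):
    """Distance actually travelled for a drive command, with sensor noise."""
    return command * (1.0 - TRUE_SLIP) + rng.gauss(0.0, 0.01)


def simulator_error(sim_slip, observations):
    """Mean squared gap between a simulator's predictions and reality."""
    return sum(
        (cmd * (1.0 - sim_slip) - dist) ** 2 for cmd, dist in observations
    ) / len(observations)


def coevolve_simulators(sim_slips, rng, rounds=100, sigma=0.05):
    """Evolve one simulator parameter per robot across the swarm."""
    sims = list(sim_slips)
    for _ in range(rounds):
        new_sims = []
        for own in sims:
            # Each robot collects a few real measurements of its own.
            observations = [
                (cmd, real_world_distance(cmd, rng)) for cmd in (0.5, 1.0, 1.5)
            ]
            # Take a mutated copy of a randomly chosen robot's simulator...
            candidate = rng.choice(sims) + rng.gauss(0.0, sigma)
            # ...and adopt it only if it explains this robot's data better.
            if simulator_error(candidate, observations) < simulator_error(
                own, observations
            ):
                new_sims.append(candidate)
            else:
                new_sims.append(own)
        sims = new_sims
    return sims


def mean_gap(sims):
    """Average distance between the simulators' parameter and the true one."""
    return sum(abs(s - TRUE_SLIP) for s in sims) / len(sims)


if __name__ == "__main__":
    initial = [0.0, 0.6, 0.9]  # three robots, three poor simulators
    final = coevolve_simulators(initial, random.Random(1))
    print(mean_gap(initial), mean_gap(final))
```

In this sketch the swarm helps because a good simulator found by one robot can spread to the others. Whether the real experiment shares simulators in this way is not stated in the post.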

Related blog posts:
Environment-driven distributed evolutionary adaptation
Walterian creatures


  1. One way to overcome the reality gap is to use reality as its own simulation. So you might have the robot wander around foraging, and during this time it memorises its sensor and actuator sequences, perhaps storing them in a self-organising map, whenever it successfully finds food. Then during subsequent foraging it could compare its recent sensor and actuator data against previously remembered ones, and if a sufficiently close match is found then try to play back a similar sequence. Memories could also be rated according to how repeatable they are and the map updated accordingly.

    I think this kind of arrangement would be closer to a biological situation, where we can to some extent simulate worlds within our imagination, but as an assembly of elementary concepts obtained from previous experiences (a kind of mash-up or remix).

  2. Hi Bob

    Thanks for your great comment.

    In fact the idea of using the real world as its own best representation was developed by Rodney Brooks and co-workers in the 1980s as a reaction to the then prevailing idea that robots needed to have an internal representation of the world to be able to function at all. See his famous 1991 paper Intelligence without representation.

    What we're doing is different in that we want to speed up evolution by fast forwarding it inside the robot's brain and to do that we can't use the real world. If we did use the real world then evolving the robots would take (literally) years of running them continuously.

    So we have to use a simulation to speed up evolution, but of course the cost of using a simulation is the problem of the reality gap that I describe in the blog post, hence the idea of co-evolving the simulator to overcome that problem.

  3. Interesting, but I do have a question.
    Given that you did not see the need to provide physical "food" or a physical "nest" on the, quite reasonable, grounds that engineering the picking up and dropping off capabilities add nothing to the value of what is being researched, why did you bother with physical robots at all? They could have been virtual too. After all, we already know how to make mechanical things move around in a 2D space.
    If the robots were virtual, not only would you have learned your results sooner, you could also extend the experiment to very large numbers of robots at very small extra cost.

    Just a thought.
    -Tim Dunn

    1. Good question. There are several reasons: (1) noise - which we get for free with real robots; all kinds of non-systematic noise and heterogeneities which would be more or less impossible to model in a simulation. (2) asynchronicity - which again you get for free with real robots; doing everything in simulation you often get simulation artefacts because of having to simulate the parallelism of real robots in the real world, and (3) the reality gap. It is well known that if you evolve something in simulation, then transfer it to the real robots, the evolved 'controller' doesn't work in the real robots. The problem is called the reality gap and since we already face this problem with the embedded simulator in each robot, simulating the whole thing would compound the problem - we would have the reality gap squared!