My PhD student Paul has started a new series of experiments in embodied evolution in the swarm robotics lab. Here's a picture showing his experiment with 3 Linux e-puck robots in a small circular arena together with an infra-red beacon (at about 2 o'clock).
The task the robots are engaged in is collective foraging for food. Actually there's nothing much to see here because the food items are virtual (i.e. invisible) blobs in the arena that the robots have to 'find', then 'pick up' and 'transport' to the nest (again virtually). The nest region is marked by the infra-red beacon - so the robots 'deposit' the food items in the pool of IR light in the arena just in front of the beacon. The reason we don't bother making physical food items and grippers, etc, is that this would entail engineering work that's really not important here. You see, here we are not so interested in collective foraging - it's just a test problem for investigating the thing we're really interested in, which is embodied evolution.
The point of the experiment is this: at the start the robots don't know how to forage for food; during the experiment they must collectively 'evolve' the ability to forage. Paul is researching this process of collective evolution. Before explaining what's going on 'under the hood' of these robots, let me give some background. Evolutionary robotics has been around for nearly 20 years. The idea is that instead of hand-designing a robot's control system, we use an artificial process inspired by Darwinian evolution, called a genetic algorithm. It's really a way of automating the design process. Evolutionary algorithms have been shown to be a very efficient way of searching the so-called design space and, in theory, can come up with (literally evolve) better solutions than we could invent by hand. Much more recent is the study of evolutionary swarm robotics (which is why there's no Wikipedia entry yet), which tackles the harder problem of evolving the controllers for the individual robots in a swarm such that, collectively, the swarm will self-organise to solve the overall task.
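For readers who like to see ideas in code, here is a minimal genetic algorithm sketched in Python. To be clear, this is illustrative only and not Paul's code: the genome encoding, population size, mutation scheme and the stand-in fitness function are all my assumptions. In evolutionary robotics the genome would typically encode a robot controller (say, neural network weights), and fitness would come from trialling that controller at the task.

```python
import random

GENOME_LEN = 16      # assumed genome size (e.g. controller weights)
POP_SIZE = 20        # assumed population size
MUTATION_STD = 0.1   # assumed mutation strength

def random_genome():
    return [random.uniform(-1.0, 1.0) for _ in range(GENOME_LEN)]

def fitness(genome):
    # Placeholder: in evolutionary robotics this would run the
    # controller (in simulation or on a robot) and score how well
    # it performs the task - here, foraging.
    return -sum(g * g for g in genome)

def mutate(genome):
    return [g + random.gauss(0.0, MUTATION_STD) for g in genome]

def evolve(generations=50):
    population = [random_genome() for _ in range(POP_SIZE)]
    for _ in range(generations):
        # Rank by fitness, keep the best half, refill the population
        # with mutated copies of the survivors.
        population.sort(key=fitness, reverse=True)
        survivors = population[: POP_SIZE // 2]
        population = survivors + [mutate(random.choice(survivors))
                                  for _ in range(POP_SIZE - len(survivors))]
    return max(population, key=fitness)

best_genome = evolve()
```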
Still with me? Good. Now let me explain what's going on in the robots of Paul's experiment. Each robot has inside it a simulation of itself and its environment (the food, the other robots and the nest). That simulation is not run once, but many times over within a genetic algorithm inside the robot. Thus each robot is running a simulation of the process of evolution, of itself, in itself. When that process completes (about once a minute), the best-performing evolved controller is transferred into the real robot's controller. Since the embodied evolutionary process runs through several (simulated) generations of robot controller, the final winner of each evolutionary competition is, in effect, a great great... grandchild of the robot controller at the start of that cycle. While the real robots are driving around the arena foraging (virtual) food and returning it to the nest, simulated evolution is running - in parallel - as a background process. Every minute or so the real robots' controllers are updated with the latest generation of (hopefully improved) evolved controllers, so what we observe is the robots gradually getting better and better at collective foraging. If you think this sounds complicated – it is. The software architecture that Paul has built to accomplish this is ferociously complex, and all the more remarkable because it fits within a robot about the size of a salt shaker. But in essence it is like this: what’s going on inside the robots is a bit like you imagining lots of different ways of riding a bike over and over, inside your head, while actually riding a bike.
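For the programmers among you, the overall structure (though emphatically not the detail) of that architecture looks something like the sketch below. Every name here - the Simulator and Robot stubs, the one-minute update period - is a hypothetical stand-in of mine, not Paul's real interfaces:

```python
import threading
import time

class Simulator:
    """Stand-in for the robot's internal model of itself, the arena,
    the food and the nest."""
    def run_genetic_algorithm(self):
        # Would run many simulated generations of foraging controllers
        # and return the fittest; here it just returns a placeholder.
        return "evolved-controller"

class Robot:
    """Stand-in for the real e-puck's control interface."""
    def load_controller(self, controller):
        print("loading new controller:", controller)
    def step(self):
        pass  # one step of real-world foraging behaviour

def embodied_evolution(robot, simulator, update_period=60.0):
    def evolve_forever():
        while True:
            # Evolve controllers inside the internal simulation...
            best = simulator.run_genetic_algorithm()
            # ...then transfer the winner into the real robot,
            # roughly once a minute (per the text above).
            robot.load_controller(best)
            time.sleep(update_period)

    # Simulated evolution runs in parallel, as a background process,
    # while the robot forages in the real arena in the foreground.
    threading.Thread(target=evolve_forever, daemon=True).start()
    while True:
        robot.step()
```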
Putting a simulation inside a robot is something roboticists refer to as ‘robots with internal models’, and if we are to build real-world robots that are more autonomous and more adaptable – in short, smarter – then this kind of complexity is something we will have to master.
If you’ve made it this far, you might well ask the question, “what if the simulation inside the robot is an inaccurate representation of the real world – won’t that mean the evolved controller will be rubbish?” You would be right. One of the problems that has dogged evolutionary robotics is known as the 'reality gap': the gap between the real world and the simulated world, which means that a controller evolved (and therefore optimised) in simulation typically doesn't work very well - or sometimes not at all - when transferred to the real robot and run in the real world. Paul is addressing this hard problem by also evolving the embedded simulators at the same time as evolving the robots' controllers, a process called co-evolution. This is where having a swarm of robots is a real advantage: just as we have a population of simulated controllers evolving inside each robot, we have a population of simulators - one per robot - evolving collectively across the swarm.
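Purely as a sketch, here is one way such simulator co-evolution could be organised. The scoring scheme is my assumption, not a description of Paul's method: I assume each simulator is scored by how closely its predictions match what the real robot actually senses, and that robots share their best simulator genomes across the swarm.

```python
import random
from dataclasses import dataclass

@dataclass
class SwarmRobot:
    sim_params: list          # this robot's simulator genome
    sim_fitness: float = 0.0  # how well its simulator predicts reality

def simulator_fitness(real_log, predicted_log):
    # Score a simulator by prediction error: the smaller the gap between
    # what the internal simulator predicted and what the real robot
    # actually sensed, the fitter the simulator (i.e. the smaller its
    # 'reality gap').
    return -sum((r - p) ** 2 for r, p in zip(real_log, predicted_log))

def coevolve_step(swarm, mutation_std=0.05):
    # One collective step: rank the swarm's simulators by predictive
    # fitness; the weaker half adopt a mutated copy of the best one.
    ranked = sorted(swarm, key=lambda r: r.sim_fitness, reverse=True)
    best = ranked[0]
    for robot in ranked[len(ranked) // 2:]:
        robot.sim_params = [p + random.gauss(0.0, mutation_std)
                            for p in best.sim_params]

# Toy usage: three robots, each with an 8-parameter simulator genome.
swarm = [SwarmRobot([random.uniform(-1, 1) for _ in range(8)])
         for _ in range(3)]
coevolve_step(swarm)
```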
Related blog posts:
Environment-driven distributed evolutionary adaptation
Walterian creatures