Alan Winfield's Web Log: New experiments in the new lab

Wednesday, January 11, 2012

New experiments in the new lab

Last week my PhD student Mehmet started a new series of experiments in embodied behavioural evolution. The exciting new step is that we've now moved to active imitation. In our previous trials robot-robot imitation has been passive; in other words, when robot B imitates robot A, robot A receives no feedback at all - not even that its action has been imitated. With active imitation, robot A receives feedback - it receives information on which of its behaviours has been imitated, how well the behaviour has been imitated and by whom.

The switch from passive to active imitation has required a major software rewrite, both for the robots' control code and for the infrastructure. We made the considered decision that the feedback mechanism - unlike the imitation itself - is not embodied. In other words, the system infrastructure both figures out which robot has imitated which (not trivial to do) and radios the feedback to the robots themselves. The reason for this decision is that we want to see how that feedback can be used to - for instance - reinforce particular behaviours, so that we can model the idea that agents are more likely to re-enact behaviours that have been imitated by other agents than those that haven't. We are not trying to model active social learning (in which a learner watches a teacher, then the teacher watches the learner to judge how well they've learned, and so on), so we avoid the additional complexity of embodied feedback.

In the first tests with the new active imitation setup we've introduced a simple change to the behaviour selection mechanism. Every robot has a memory with all of its initialised or learned behaviours. Each one of those behaviours now has a counter that gets incremented each time that particular behaviour is imitated. A robot selects which of its stored behaviours to enact at random, but with probabilities that are determined by the counter values, so that a behaviour with a higher count is more likely to be selected. But, as I've discovered peering at the data generated from the initial runs, it's not at all straightforward to figure out what's going on and - most importantly - what it means. It's the hermeneutic challenge again.
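
For readers who like to see this sort of thing in code, here is a minimal Python sketch of the counter-weighted selection described above. It is an illustration only, not the robots' actual control code: the names `Behaviour`, `BehaviourMemory`, `record_imitation` and `select` are made up for this example, and the weighting rule (count plus one, so that a behaviour that has never been imitated can still be chosen) is an assumption; the exact mapping from counter values to probabilities isn't specified here.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Behaviour:
    """One stored behaviour and the number of times it has been imitated."""
    name: str
    imitation_count: int = 0


@dataclass
class BehaviourMemory:
    """Hypothetical per-robot memory of initialised or learned behaviours."""
    behaviours: list = field(default_factory=list)

    def record_imitation(self, name):
        """Increment the counter of the behaviour reported as imitated."""
        for behaviour in self.behaviours:
            if behaviour.name == name:
                behaviour.imitation_count += 1
                return
        raise KeyError(name)

    def selection_weights(self):
        # Assumption: weight = count + 1, so never-imitated behaviours
        # keep a non-zero chance of being enacted.
        return [b.imitation_count + 1 for b in self.behaviours]

    def select(self, rng=random):
        """Pick a behaviour at random, biased towards higher counts."""
        weights = self.selection_weights()
        return rng.choices(self.behaviours, weights=weights, k=1)[0]


# Example: three behaviours, one of which has been imitated three times.
memory = BehaviourMemory([Behaviour("a"), Behaviour("b"), Behaviour("c")])
for _ in range(3):
    memory.record_imitation("b")
chosen = memory.select()
print(chosen.name, memory.selection_weights())
```

With the weights in this example, behaviour "b" would be enacted about two thirds of the time (4 out of a total weight of 6). A different weighting, say one that grows faster with the count, would reinforce imitated behaviours more strongly.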

So, for now here's a picture of the experimental setup in our shiny new* lab. Results to follow!