Imitation has long been regarded as a compelling method for (social) learning in robots. However, robot imitation faces a number of challenges; one of the most fundamental is determining what to imitate. Although not trivial, it is relatively straightforward to imitate actions — something we explored within the Artificial Culture project. But inferring goals from observed actions, and thus determining which parts of a demonstrated sequence of actions are relevant (i.e., rational imitation), is very challenging.
The approach we take in our paper Rational imitation for robots: the cost difference model is to equip the imitating robot with a simulation-based internal model, which the robot uses to explore alternative sequences of actions needed to attain the demonstrator robot's potential goals (i.e., goals that are possible explanations for its observed actions). Comparing these simulated actions with those actually observed enables the imitating robot to infer the goals underlying the demonstrator's behaviour.
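The idea can be illustrated with a minimal sketch (not the paper's actual implementation): for each candidate goal, the imitator simulates the cheapest path the demonstrator could have taken, then picks the goal whose simulated path best matches the observation. The function names, the straight-line simulator, and the crude mismatch score are all assumptions for illustration.

```python
import math

def path_length(traj):
    """Total Euclidean length of a trajectory given as (x, y) waypoints."""
    return sum(math.dist(a, b) for a, b in zip(traj, traj[1:]))

def simulate_path(start, goal):
    """Stand-in for the internal model: here, simply the straight line.
    A real internal model would plan around known obstacles."""
    return [start, goal]

def mismatch(observed, simulated):
    """Crude discrepancy score: endpoint error plus difference in path
    length. A real system would compare trajectories point by point."""
    return (math.dist(observed[-1], simulated[-1])
            + abs(path_length(observed) - path_length(simulated)))

def infer_goal(observed, candidate_goals):
    """Return the candidate goal that best explains the observed actions."""
    start = observed[0]
    return min(candidate_goals,
               key=lambda g: mismatch(observed, simulate_path(start, g)))
```

For example, a demonstrator seen walking from (0, 0) straight to (4, 0) is best explained by a goal at (4, 0) rather than one at (0, 4), since the simulated path to the former matches the observation exactly.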
Here is the abstract from our paper:
Infants imitate behaviour flexibly. Depending on the circumstances, they copy both actions and their effects or only reproduce the demonstrator’s intended goals. In view of this selective imitation, infants have been called rational imitators. The ability to selectively and adaptively imitate behaviour would be a beneficial capacity for robots. Indeed, selecting what to imitate is one of the outstanding unsolved problems in the field of robotic imitation. In this paper, we first present a formalized model of rational imitation suited for robotic applications. Next, we test and demonstrate it using two humanoid robots.
My colleague Dieter Vanderelst conducted several experiments to demonstrate rational imitation. Let me outline one of them, which uses two NAO robots.
Here panels (a,d,g) show the setup, with blue as the demonstrating robot and red as the observing (then imitating) robot. Panels (b,e,h) show trajectories of three runs of the demonstrator robot blue, and panels (c,f,i) show trajectories of three runs of the imitating robot red. Note that red always starts from the position it observes from, as you would if you were imitating your dance teacher.
In condition 1 blue moves directly to its goal position (panels a,b). Red infers that blue's only goal is that final position, and moves directly to it (panel c).
In condition 2 blue deviates around an obstacle even though it has a direct path to its goal (panels d,e). In this case red infers that the deviation must be a sub-goal of blue — since blue is able to go directly to its goal but chooses not to — so in panel f red creates a trajectory via blue's sub-goal. In other words, red has correctly inferred blue's intentions and imitates its goals.
In condition 3 blue’s path to its goal is blocked so it has no choice but to divert (panels g,h). In this case red infers that blue has no sub-goals and moves directly to the goal position (panel i).
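The decision the imitator faces across these three conditions can be sketched as a simple cost-difference rule (a hypothetical illustration, not the paper's code): if the observed trajectory costs noticeably more than the cheapest path that was actually available to the demonstrator, the deviation was a choice and is adopted as a sub-goal; otherwise the imitator heads straight for the goal. The threshold, the sub-goal heuristic, and all function names here are assumptions.

```python
import math

def path_length(traj):
    """Total Euclidean length of a trajectory given as (x, y) waypoints."""
    return sum(math.dist(a, b) for a, b in zip(traj, traj[1:]))

def point_line_dist(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    return num / math.dist(a, b)

def infer_subgoal(observed, cheapest_available, blocked, tol=0.1):
    """Cost-difference rule (illustrative).

    observed           -- demonstrator's observed trajectory
    cheapest_available -- cheapest path to the goal the demonstrator could
                          have taken (found via the imitator's internal
                          simulation; detours around any real obstruction)
    blocked            -- True if the straight path to the goal was blocked
    Returns a waypoint to adopt as a sub-goal, or None.
    """
    extra_cost = path_length(observed) - path_length(cheapest_available)
    if extra_cost > tol and not blocked:
        # The deviation was a choice, not a necessity: take the observed
        # point farthest from the straight start-goal line as the sub-goal.
        start, goal = observed[0], observed[-1]
        return max(observed[1:-1],
                   key=lambda p: point_line_dist(p, start, goal))
    return None  # deviation fully explained by the environment (or absent)
```

Condition 1 (direct path, direct movement) and condition 3 (deviation forced by an obstruction) both yield no sub-goal; only condition 2, where blue detours despite a free direct path, makes the deviation point a sub-goal for red.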
Full paper reference:
Vanderelst, D. and Winfield, A. F. (2017) Rational imitation for robots: The cost difference model. Adaptive Behavior, 25(2), pp. 60-71.
The experiments here were conceived and conducted by Dr Dieter Vanderelst, within EPSRC project Verifiable Autonomy.