Thursday, March 28, 2013

A Crisis of Expectations

At the first UK Robot Ethics workshop on 25th March 2013, I offered - for discussion - the proposition that robotics is facing a Crisis of Expectations. And not for the first time. I argue that one possible consequence is (another) AI winter.

Here is a hypertext linked version of my paper.


Introduction

In this talk I set out the proposition that robotics is facing a crisis of expectations. As a community we face a number of expectation gaps - significant differences between what people think robots are and do, and what robots really are and really do, and (more seriously) might reasonably be expected to do in the near future. I will argue that there are three expectation gaps at work here: public expectations, press and media expectations and funder or stakeholder expectations, and that the combined effect of these amounts to a crisis of expectations. A crisis we roboticists need to be worried about.


Public Expectations

Here's a simple experiment. Ask a non-roboticist to give you an example of a robot - the first that comes into their mind. The odds are that it will be a robot from a Science Fiction movie: perhaps Terminator, R2-D2 or C-3PO or Data from Star Trek. Then ask them to name a real-world robot. Unlike your first question, which they will have answered quickly, this one generally needs a little longer. You might get an answer like "the robot in the advert that spray-paints cars" or, if you're lucky, they might know someone with a robot vacuum cleaner. So, although most people have a general idea that there are robots in factories, or helping soldiers to defuse bombs, the robots that they are most familiar with - the ones they can name and describe - are fictional.

None of this is surprising. The modern idea of a robot was, after all, conceived in Science Fiction. Czech playwright Karel Čapek first used the word Robot to describe a humanoid automaton in his play Rossum’s Universal Robots (RUR) and Isaac Asimov was the first to coin the word Robotics, in his famous short stories of the 1940s. The idea of a robot as an artificial mechanical person has become a ubiquitous fictional trope, and robots have, for half a century, been firmly rooted in our cultural landscape. We even talk about people acting robotically and, curiously, we don't mean like servants, we mean in a fashion that mimics the archetypal robot: stiff-jointed and emotionally expressionless.

Furthermore, people like robots, as anyone who has had the pleasure of giving public talks will know. People like robots because robots are, to paraphrase W. Grey Walter, An Imitation of Life. Which probably accounts for the observation that we are all, it seems, both fascinated and disturbed by robots in equal measure. I have often been asked the question "how intelligent are intelligent robots?", but there's always an unspoken rider "...and should we be worried?". Robot dystopias, from Terminator, to The Matrix or out-of-control AI like HAL in Kubrick's 2001, make compelling entertainment but undoubtedly feed the dark side of our cultural love affair with robots.

It is not surprising then, that most people's expectations about robots are wrong. Their beliefs about what real-world robots do now are hazy, and their expectations about what robots might be like in the near future are often spectacularly over-optimistic. Some think that real-world robots are just like movie robots. Others are disappointed and feel that robotics has failed to deliver the promises of the 1960s. This expectation gap - the gap between what people think robots are capable of and what they're really capable of - is not one-dimensional and is, I argue, a problem for the robotics community. It is a problem that can manifest itself directly when, for instance, public attitudes towards robots are surveyed and the results used to inform policy [1]. It makes our work as roboticists harder, because the hard problems we are working on are problems many people think already solved, and because it creates societal expectations of robotics that cannot be met. And it is a problem because it underpins the next expectation gap I will describe.

Press and Media Expectations

You are technically literate, an engineer or scientist perhaps with a particular interest in robotics, but you've been stranded on a desert island for the past 30 years. Rescued and returned to civilisation you are keen to find out how far robotics science and technology has advanced and - rejoicing in the marvellous inventions of the Internet and its search engines - you scour the science press for robot news. Scanning the headlines you are thrilled to discover that robots are alive, and sending messages from space; robots can think or are "capable of human reasoning or learning"; robots have feelings, relate to humans, or demonstrate love, even behave ethically. Truly robots have achieved their promised potential. 

Then of course you start to dig deeper and read the science behind these stories. The truth dawns. Although the robotics you are reading about is significant work, done by very good people, the fact is - you begin to realise - that now, in 2013, robots cannot properly be said to think, feel, empathise, love or be moral agents; and certainly no robot is, in any meaningful sense, alive or sentient. Of course your disappointment is tempered by the discovery that astonishing strides have nevertheless been made.

So, robotics is subject to journalistic hype. In this respect robotics is not unique. Ben Goldacre has done much to expose bad science reporting, especially in medicine. But a robot is different to, say, a new strain of MRSA because - as I outlined above - most people think they know what a robot is. Goldacre has characterised bad science stories as falling into three categories: wacky stories, scare stories and breakthrough stories [2]. My observation is that robots in the press most often headline as either wacky or scary, even when the development is highly innovative.

I believe that robohype is a serious problem and an issue that the robotics community should worry about. The problem is this. Most people who read the press reports are lay readers who - perfectly reasonably - will not read much beyond the headline; certainly few will look for the source research. So every time a piece of robohype appears (pretty much every day) the level of mass-delusion about what robots do increases a bit more, and the expectation gap ratchets a little wider. Remember that the expectation gap is already wide. We are at the same time fascinated and fearful of robots, and this fascination feeds the hype because we want (or dread) the robofiction to become true. Which is of course one of the reasons for the hype in the first place.

Who's to blame for the robohype? Well we roboticists must share the blame. When we describe our robots and what they do we use anthropomorphic words, especially when trying to explain our work to people outside the robotics community. Within the robotics and AI community we all understand that when we talk about an intelligent robot, what we mean is a robot that behaves as if it were intelligent; 'intelligent robot' is just a convenient shorthand. So when we talk to journalists we should not be too surprised when "this robot behaves, in some limited sense, as if it has feelings" gets reported as "this robot has feelings". But science journalists must, I think, also do better than this.

Funder and Stakeholder Expectations

Many of us rely on research grants to fund our work and - whether we like it or not - we have to become expert in the discipline of grantology. We pore over the small print of funding calls and craft our proposals with infinite care in an effort to persuade reviewers (also skilled grantologists) to award the coveted 'outstanding' scores. We are competing for a share of a limited resource, and the most persuasive proposals - the most adventurous, which also promise the greatest impact while matching themes defined to be of national importance - tend to succeed. Of course all of this is more or less equally true whether you are bidding for a grant in history, microbiology or robotics. But the crisis of expectations makes robotics different.

There are, I think, three factors at work. The first is the societal and cultural context - the expectation gaps I have outlined above. The second is the arguably disgraceful lack of useful and widely accepted benchmarks in robotics, which means that it is perfectly possible to spend 3 years developing a new robot which is impossible to quantifiably demonstrate as superior to comparable robots, including those that already existed when that project started. And the third is the fact that policymakers, funders and stakeholders are themselves under pressure to deliver solutions to very serious societal or economic challenges and therefore perhaps too eager to buy into the promise of robotics. Whether naively or wittingly, we roboticists are I believe guilty of exploiting these three factors when we write our grant applications.

I believe we now find ourselves in an environment in which it is almost de rigueur to over-promise when writing grant applications. Only the bravest proposal writer will be brutally honest about the extreme difficulty of making significant progress in, for instance, robot cognition and admit that even a successful project, which incrementally extends the state of the art, may have only modest impact. Of course I am not suggesting that all grants over-promise and under-deliver, but I contend that many do and - because of the factors I have set out - they are rarely called to account. Clearly the danger is that sooner or later funding bodies will react by closing down robotics research initiatives and we will enter a new cycle of AI Winter.

AI has experienced "several cycles of hype, followed by disappointment and criticism, followed by funding cuts, followed by renewed interest years or decades later" [3]. The most serious AI Winter in the UK was triggered by the Lighthill Report [4] which led to a more or less complete cancellation of AI research in 1974. Are we heading for a robotics winter? Perhaps not. One positive sign is the identification of Robotics and Autonomous Systems as one of eight technologies of strategic importance to the UK [5]. Another is the apparent health of robotics funding in the EU and, in particular, Horizon 2020. But a funding winter is only the most extreme consequence of the culture of over-promising I have outlined here.

Discussion

I want to conclude this talk with some thoughts on how we, as a community, should respond to the crisis of expectations. And respond we must. We have, I believe, an ethical duty to the society we serve, as well as to ourselves, to take steps to counter the expectation gaps that I have outlined. Those steps might include:
  • At every opportunity, individually and collectively, we engage the public in honest explanation and open dialogue to raise awareness of the reality of robotics. We need to be truthful about the limitations of robots and robot intelligence, and measured with our predictions. We can show that real robots are both very different and much more surprising than their fictional counterparts.
  • When we come across particularly egregious robot reporting in the press and media we make the effort to contact the reporting journalist, to explain simply and plainly the true significance of the work behind the story. 
  • Individually and collectively we endeavour to resist the pressure to over-promise in our bids and proposals, and when we review proposals or find ourselves advising on funding directions or priorities, we seek to influence towards a more measured and ultimately sustainable approach to the long term Robotics Project.


References

[1] Public Attitudes towards Robots. Special Eurobarometer 382, European Commission, 2012.

[2] Ben Goldacre. Don't dumb me down. The Guardian, 8 September 2005.
  
[3] AI Winter. Wikipedia, accessed 14 March 2013.
  
[4] James Lighthill. Artificial Intelligence: A General Survey. In Artificial Intelligence: a paper symposium, Science Research Council, 1973. Here is a BBC televised debate which followed publication of the Lighthill report, in which Donald Michie, Richard Gregory and John McCarthy challenge the report and its recommendations (1973).

Sunday, March 24, 2013

Robotics has a new kind of Cartesian Dualism, and it's just as unhelpful

I believe robotics has re-invented mind-body dualism.

At the excellent European Robotics Forum last week I attended a workshop called AI meets Robotics. The thinking behind the workshop was:
The fields of Artificial Intelligence (AI) and Robotics were strongly connected in the early days of AI, but became mostly disconnected later on. While there are several attempts at tackling them together, these attempts remain isolated points in a landscape whose overall structure and extent is not clear. Recently, it was suggested that even the otherwise successful EC program "Cognitive systems and robotics" was not entirely effective in putting together the two sides of cognitive systems and of robotics.
I couldn't agree more. Actually I would go further and suggest that robotics has a much bigger problem than we think. It's a new kind of dualism which parallels Cartesian mind-body dualism, except that in robotics it's hardware-software dualism. And like Cartesian dualism it could prove just as unhelpful, both conceptually and practically, in our quest to build intelligent robots.

While sitting in the workshop last week I realised rather sheepishly that I'm guilty of the same kind of dualistic thinking. In my Introduction to Robotics one of the (three) ways I define a robot is: an embodied Artificial Intelligence. And I go on to explain:
...a robot is an Artificial Intelligence (AI) with a physical body. The AI is the thing that provides the robot with its purposefulness of action, its cognition; without the AI the robot would just be a useless mechanical shell. A robot’s body is made of mechanical and electronic parts, including a microcomputer, and the AI made by the software running in the microcomputer. The robot analogue of mind/body is software/hardware. A robot’s software – its programming – is the thing that determines how intelligently it behaves, or whether it behaves at all.
But, as I said in the workshop, we must stop thinking of cognitive robots as either "a robot body with added AI", or "an AI with added motors and sensors". Instead we need a new kind of holistic approach that explicitly seeks to avoid this lazy "with added" thinking.


Thursday, March 07, 2013

Extreme debugging - a tale of microcode and an oven

It's been quite a while since I debugged a computer program. Too long. Although I miss coding, the thing I miss more is the process of finding and fixing bugs in the code. Especially the really hard-to-track-down bugs that have you tearing your hair out - convinced your code cannot possibly be wrong - that something else must be the problem. But then when you track down that impossible bug, it becomes so obvious.

I wanted to write here about the most fun I've ever had debugging code. And also the most bizarre, since fixing the bugs required the use of an oven. Yes, an oven. It turned out the bugs were temperature dependent.

But first some background. The year is 1986. I'm the co-founder of a university spin-out company in Hull, England, called Metaforth Ltd. The company was set up to commercialise a stack-based computer architecture that runs the language Forth natively. In other words Forth is the equivalent of the CPU's assembly language. Our first product was a 16-bit industrial processor which we called the MF1600. It was a 2-card module, designed to plug into the (then) industry standard VME bus. One of the cards was the Central Processing Unit (CPU) - not using a microprocessor, but a set of discrete components using fast Transistor Transistor Logic devices. The other card provided memory, input-output interfaces, and the logic needed to interface with the VME bus.

The MF1600 was fast. It ran Forth at 6.6 Million Forth Instructions Per Second (MIPS). Sluggish of course by today's standards, but in 1986 6.6 MIPS was faster than any microprocessor. Back then, PCs were powered by the state-of-the-art Intel 286 with a clock frequency of 6 MHz, managing around 0.9 Assembler MIPS. And because Forth instructions are higher level than assembler, the speed differential was greater still when doing real work.

Ok, now to the epic debugging...

One of our customers reported that during extended tests in an industrial rack the MF1600 was mysteriously crashing. And crashing in a way we'd not experienced before when running tried and tested code. One of their engineers noted that their test rack was running very hot, almost certainly exceeding the MF1600's upper temperature limit of 55°C. Out of spec maybe, but still not good.

So we knew the problem was temperature related. Now any experienced electronics engineer will know that electrical signals take time to get from one place to another. It's called propagation delay, and these delays are normally measured in billionths of a second (nanoseconds). And propagation delays tend to increase with temperature. Like any CPU our MF1600 relies on signals getting to the right place at the right time. And if several signals have to reach the same place at the same time then even a small extra delay in one of them can cause major problems.

On most CPUs when each basic instruction is executed, a tiny program inside the CPU actually does the work of that instruction. Those tiny programs are called microcode. Here is a blog post from several years ago where I explain what microcode is. Microcode is magic stuff - it's the place where software and hardware meet. Just like any program microcode has to be written and debugged, but uniquely - when you write microcode - you have to take account of how long it takes to process and route signals and data across the CPU: 100 ns from A to B; 120 ns from C to D, and so on. So if the timing in any microcode is tight (i.e. only just allows for the normal delay and leaves no margin of error), it could result in that microcode program crashing at elevated temperatures.
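To make the idea of 'tight' timing concrete, here is a minimal sketch in Python. It is not a model of the real MF1600: the step names, the delays, the time budgets and the linear temperature coefficient are all hypothetical, chosen only to show how a path with little slack can pass at room temperature and fail when hot.

```python
from dataclasses import dataclass


@dataclass
class MicroStep:
    name: str
    path_delay_ns: float  # nominal signal delay at the reference temperature
    budget_ns: float      # time the microcode allows before the result is used


def delay_at(step, temp_c, ref_temp_c=25.0, coeff_per_c=0.002):
    """Scale the nominal delay linearly with temperature.

    The linear model and the coefficient are assumptions for illustration.
    """
    return step.path_delay_ns * (1.0 + coeff_per_c * (temp_c - ref_temp_c))


def tight_steps(steps, temp_c):
    """Return the names of steps whose delay exceeds their budget at temp_c."""
    return [s.name for s in steps if delay_at(s, temp_c) > s.budget_ns]


# Hypothetical microcode steps: one with plenty of slack, one with very little.
steps = [
    MicroStep("A->B", path_delay_ns=100.0, budget_ns=150.0),
    MicroStep("C->D", path_delay_ns=140.0, budget_ns=150.0),
]

# The same steps after giving the tight path a larger time budget.
relaxed = [
    MicroStep("A->B", path_delay_ns=100.0, budget_ns=150.0),
    MicroStep("C->D", path_delay_ns=140.0, budget_ns=200.0),
]

print(tight_steps(steps, 25.0))    # no failures at the reference temperature
print(tight_steps(steps, 70.0))    # the low-slack path fails when hot
print(tight_steps(relaxed, 70.0))  # the larger budget removes the failure
```

The point of the sketch is only that the failing step is the one with the smallest slack, not the one with the longest delay in absolute terms.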

So, we reckoned we had one, or possibly several, microcode programs in the MF1600 CPU with 'tight' timing. The question was, how to find them.

The MF1600 CPU had around 86 (Forth) instructions, and the timing bugs could be in any of them. Now testing microcode is very difficult, and the nature of the problem made the testing problem even worse. A timing problem at elevated temperatures means that testing the microcode by single-stepping the CPU clock and tracing the signals through the CPU with a logic analyser wouldn't help at all. We needed a way to efficiently identify the buggy instructions. Then we could worry about debugging them later. What we wanted was a way to exercise single instructions, one by one, on a running system at high temperatures.

Then we remembered that we don't need all 86 instructions to run the computer. Most of them can be emulated by putting together a set of simpler instructions. So a strategy formed: (1) write a set of tiny Forth programs that replace as many of the CPU instructions as possible, (2) recompile the operating system, then (3) hope that the CPU runs ok at high temperature. If it does then (4) run the CPU in an oven and one by one test the replaced instructions.

Actually it didn't take long to do steps (1) and (2), because the Forth programs already existed to express more complex instructions as sets of simpler ones. Many Forth systems on conventional microprocessor systems were built like that. In the end we had a minimal set of about 24 instructions. So, with the operating system recompiled and installed we put the CPU into the oven and switched on the heat. The system ran perfectly (but a little slower than usual), and continued to run well above the temperature at which it had previously crashed. A real stroke of luck.

Here's an example of a simple Forth instruction to replace two values on the stack with the smaller of those values, expressed as a Forth program we call MIN:
: MIN  OVER OVER > IF SWAP THEN DROP ;
(From my 1983 book The Complete Forth).
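For readers who don't know Forth, the MIN definition can be traced with a toy stack interpreter. The Python sketch below is purely illustrative: the `run` function is a hypothetical helper that handles only the six words used in MIN, with no nesting of IF, and it follows the usual Forth convention that a true flag is -1 and a false flag is 0.

```python
def run(words, stack):
    """Execute a flat list of Forth-like words on a copy of the stack.

    Supports only OVER, SWAP, DROP, >, IF and THEN (IF may not be nested).
    """
    s = list(stack)
    i = 0
    while i < len(words):
        w = words[i]
        if w == "OVER":      # ( a b -- a b a )
            s.append(s[-2])
        elif w == "SWAP":    # ( a b -- b a )
            s[-1], s[-2] = s[-2], s[-1]
        elif w == "DROP":    # ( a -- )
            s.pop()
        elif w == ">":       # ( a b -- flag ), flag is -1 if a > b, else 0
            b = s.pop()
            a = s.pop()
            s.append(-1 if a > b else 0)
        elif w == "IF":      # consume the flag; skip to THEN when it is false
            if s.pop() == 0:
                i = words.index("THEN", i)
        elif w == "THEN":    # marks the end of the IF body, does nothing
            pass
        else:
            raise ValueError(f"unknown word: {w}")
        i += 1
    return s


MIN = "OVER OVER > IF SWAP THEN DROP".split()

print(run(MIN, [3, 5]))  # smaller value already underneath: just DROP the top
print(run(MIN, [5, 3]))  # larger value underneath: SWAP first, then DROP
```

Either way a single value is left on the stack, the smaller of the two, which is exactly what the built-in instruction would do, only in more steps.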

From then on it was relatively easy to run small test programs to exercise the other 62 instructions (which were of course still there in the CPU - just not used by the operating system). A couple of days' work and we found the 2 rogue instructions that were crashing at temperature. They were - as you might have expected - rather complex instructions. One was (LOOP), an instruction for do loops.
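The isolation strategy can be summarised as a small Python sketch. Everything here is a stand-in: the instruction names are invented, the two 'faulty' instructions are picked arbitrarily, and `crashes_when_hot` is a hypothetical oracle that replaces what in reality was a test program running on the CPU in the oven.

```python
def find_rogue_instructions(all_instructions, minimal_set, crashes_when_hot):
    """Exercise each instruction outside the minimal set, one at a time.

    Relies on the system being stable when it uses only the minimal set,
    so that any crash can be blamed on the single instruction under test.
    """
    candidates = [i for i in all_instructions if i not in minimal_set]
    return [i for i in candidates if crashes_when_hot(i)]


# Hypothetical instruction names: 86 in total, the first 24 form the minimal set.
all_instructions = [f"OP{n}" for n in range(86)]
minimal_set = set(all_instructions[:24])

# Arbitrary stand-ins for the two instructions with tight microcode timing.
faulty = {"OP40", "OP71"}


def crashes_when_hot(instruction):
    """Pretend to run a test program for one instruction at high temperature."""
    return instruction in faulty


rogue = find_rogue_instructions(all_instructions, minimal_set, crashes_when_hot)
print(len(all_instructions) - len(minimal_set))  # instructions left to test
print(rogue)
```

The approach only works because the minimal set happened to contain none of the faulty instructions, which is why the system running cleanly in the oven was such a stroke of luck.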

Then debugging those instructions simply required studying the microcode and the big chart with all the CPU delay times, over several pots of coffee. We knew (or strongly suspected) that we were looking for timing problems, called race hazards, where the data from one part of the CPU just doesn't have time to get to another part before it is needed for the next step of the microcode program. Having identified the suspect timing I then re-wrote the microcode for those instructions to leave a bit more time - by adding one clock cycle (50 ns) to each instruction.
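As a rough back-of-envelope check on what one extra clock cycle costs, the figures quoted in this post can be combined in a few lines of Python. This assumes that the 50 ns cycle is the CPU's basic clock period and that the 6.6 MIPS figure was measured at that clock; the 3-cycle instruction at the end is a hypothetical example, not a real MF1600 instruction.

```python
CLOCK_PERIOD_NS = 50.0  # one clock cycle, as quoted above
FORTH_MIPS = 6.6        # quoted speed in million Forth instructions per second

# Clock frequency implied by a 50 ns period.
clock_hz = 1e9 / CLOCK_PERIOD_NS

# Average number of clock cycles per Forth instruction implied by the two figures.
avg_cycles = clock_hz / (FORTH_MIPS * 1e6)


def slowdown_percent(cycles_before, extra_cycles=1):
    """Percentage increase in execution time for one instruction."""
    return 100.0 * extra_cycles / cycles_before


print(clock_hz / 1e6)        # clock frequency in MHz
print(round(avg_cycles, 2))  # average cycles per instruction
print(round(slowdown_percent(3), 1))  # hypothetical 3-cycle instruction
```

Under these assumptions the average instruction took about three cycles, so lengthening just two of the 86 instructions by one cycle each would cost very little overall speed.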

Then reverting to the old non-patched operating system, it was the moment of truth. Back in the oven, cranking up the temperature, while the CPU was running test programs specifically designed to stress those particular instructions. Yes! The system didn't crash at all, over several days of running at temperature. I recall pushing the temperature above 100°C. Components on the CPU circuit board were melting, but still it didn't crash.

So that's how we debugged code with an oven.