Feb 18 2011
Dr. Watson
Recently an IBM computer program, named Watson, beat the pants off the two top-performing Jeopardy champions, Ken Jennings and Brad Rutter. This is being hailed as a demonstration of the superiority of computers over human intelligence, and also as a breakthrough in intelligent systems (although no one is claiming any sort of artificial consciousness for Watson). The demonstration has also sparked speculation about how systems such as Watson can be applied in the future – with some speculation going too far, in my opinion.
First, let’s find out what Watson actually is. IBM describes Watson as a “system designed for answers” and to work with natural language. They chose Jeopardy (a trivia game show) as a model of this task. This is how they describe the hardware:
Operating on a single CPU, it could take Watson two hours to answer a single question. A typical Jeopardy! contestant can accomplish this feat in less than three seconds. For Watson to rival the speed of its human competitors in delivering a single, precise answer to a question requires custom algorithms, terabytes of storage and thousands of POWER7 computing cores working in a massively parallel system.
It is interesting to know that in order to create the winning performance in real time such a massive system was required – thousands of computing cores. Such a system won’t be sitting on the average desktop anytime soon. However, Moore’s law (assuming it continues to hold up for a while, which seems to be the consensus) predicts that within 20 years or so we will have today’s supercomputers on a desktop.
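As a rough sanity check on that 20-year figure, here is a back-of-the-envelope sketch using only the two numbers in IBM's description quoted above (two hours on a single CPU versus about three seconds for a human contestant). The doubling periods of 18 and 24 months are my assumptions, as is the simplification that desktop performance doubles steadily; the function name is made up for illustration.

```python
import math

# Figures from IBM's description quoted above.
single_cpu_seconds = 2 * 60 * 60   # "two hours" on a single CPU
target_seconds = 3                 # "less than three seconds" for a contestant

# How many times faster a single machine would need to be.
speedup_needed = single_cpu_seconds / target_seconds


def years_to_close_gap(speedup, doubling_period_years):
    """Years until performance has grown by `speedup`, assuming it doubles
    once every `doubling_period_years` (an idealized Moore's-law model)."""
    return math.log2(speedup) * doubling_period_years


if __name__ == "__main__":
    print(f"Speedup needed: about {speedup_needed:.0f}x")
    # Assumed doubling periods: 18 months and 24 months.
    for period in (1.5, 2.0):
        years = years_to_close_gap(speedup_needed, period)
        print(f"Doubling every {period} years: about {years:.0f} years")
```

Under those assumptions the gap closes in roughly 17 to 22 years, which is in line with the "20 years or so" estimate. It is only an order-of-magnitude check, not a forecast.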
The software is perhaps more interesting. The algorithms needed to understand natural language, and they also needed to be good at playing Jeopardy. This means not just coming up with the most likely answer, but knowing when to ring in and offer a guess (in a game where a wrong answer costs money). So the algorithm also had to estimate its level of confidence in the answer, and have some rule for where to set the threshold for ringing in. And of course Watson had to have a database of information about a wide range of topics.
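To make the ring-in decision concrete, here is a minimal sketch of a confidence threshold. This is not IBM's algorithm; the function names, the candidate answers, and the 0.5 default are all hypothetical. The default comes from a deliberately simplified model in which a right answer wins the clue's value, a wrong answer loses the same amount, and the opponents are ignored.

```python
def expected_gain(confidence, clue_value):
    """Expected winnings from answering, in a simplified model where a
    correct answer wins clue_value and a wrong one loses clue_value."""
    return confidence * clue_value - (1 - confidence) * clue_value


def should_buzz(candidates, threshold=0.5):
    """Pick the highest-confidence candidate answer.

    candidates maps each candidate answer to an estimated confidence in
    [0, 1]. Returns (answer, confidence) if the best candidate clears the
    threshold, otherwise None (stay silent).
    """
    if not candidates:
        return None
    answer, confidence = max(candidates.items(), key=lambda kv: kv[1])
    if confidence >= threshold:
        return answer, confidence
    return None


if __name__ == "__main__":
    # Hypothetical confidence estimates for two clues.
    sure = {"answer A": 0.9, "answer B": 0.05}
    unsure = {"answer C": 0.3, "answer D": 0.25}
    print(should_buzz(sure))     # clears the threshold, so ring in
    print(should_buzz(unsure))   # below the threshold, so stay silent
    print(expected_gain(0.75, 800))
```

In this toy model the expected gain is positive only when confidence is above one half, which is why 0.5 is the break-even threshold. A real player would presumably shift that threshold depending on the score and how much of the game remains.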
In the end the performance was impressive.
In the second day of Jeopardy‘s three-day “Man vs. Machine” special, Watson wiped the floor with Ken Jennings (a 74-time champion) and Brad Rutter (a 20-time champion). Ken Jennings ended day two with $4,800 and Brad Rutter ended with $10,400, while Watson took home $35,734 in prize winnings. Out of 30 answers, Watson was first to buzz in on 25 of them, getting all but one of them right.
What does this performance really mean for the future of computers? That remains to be seen – but it seems to me that Watson is essentially an expert system, a system capable of providing specific information from a vast database. Perhaps the greatest innovation is its ability to interpret natural language, a key goal for human-computer interface.
Whenever such breakthroughs and demonstrations of computing power take place (like when IBM’s Deep Blue beat chess champion Garry Kasparov) this raises the question of what computers are good at vs. the human brain. Computers are great at processing information quickly, at storing and accurately retrieving information, and at running algorithms and simulations. But they still lag way behind humans in pattern recognition. The massively parallel organization of the human brain is optimized for pattern recognition – it’s something we do well. The weakness of the brain, however, is that it is error prone and unreliable.
Computers and brains are therefore complementary in their strengths and weaknesses. It makes sense, then, that the best use of computers is to take over tasks at which computers excel and humans are weak, and also to aid (but not replace) humans in tasks at which humans excel and computers are weak. But the performance of Deep Blue or Watson does not imply that computers are ready to take over from humans what humans are good at. Here is an example of exactly that misguided interpretation.
But, why not take it a step further and just get rid of the human altogether? Literally tens of thousands of people die every year because doctors are, well, only human, and make diagnostic mistakes which can later be identified as violating evidence based medicine.
The problem I always saw was how to convince people that yes, a computer can beat House, MD. Well IBM can’t go head-to-head with House but maybe beating Ken Jennings and Brad Rutter might help.
No, a computer cannot beat House, MD, nor will it be able to anytime soon. The author of that statement, Karl Smith, is correct in noting that medicine has become too complex for mortal humans to practice error-free, or even to master all relevant information. We are probably at, or even beyond, the limit of human capability. This has forced a trend toward specialization – narrowing the field of expertise so as to limit the amount of information one needs to master. But specialization has its limits and downsides as well.
Clearly human doctors need help, and I agree that computers are an obvious solution. Incorporating error-reducing systems, like check lists, is another approach. Taking more of a team rather than individual approach to care is another potential solution.
I think there is also a role for computer-based expert systems as an aid to the process of diagnosis and treatment. Expert systems, essentially, combine a vast database of medical knowledge with the ability to provide the relevant information to a clinician at the point of patient care. If such a system were properly employed it could act like a tiny expert sitting on the shoulder of every physician, offering suggestions, checking their recommendations, and screening for errors.
Right now we have subsets of such systems. Doctors can have apps on their smartphones that will give them drug information on demand. Doctors frequently use Google to look up information, often right there in the room with the patient. (As an aside, patients are sometimes put off by this, but they shouldn’t be. This does not mean their doctor does not know what they are doing.) And we are increasingly using electronic medical records, which can also provide a layer of computer supervision. I think such systems are currently being underutilized, and there is tremendous room for improvement. But we are heading in this direction.
However – short of truly artificial human-level intelligence, such computer systems will not replace human clinicians. Clinicians still need to gather information from their patients, and this requires a great deal of interpretation. The human element is complex and chaotic, and requires the highly sophisticated social skills of another human. Also, diagnosis is often an exercise in pattern recognition – something at which humans are still superior to computers. And there is an element of judgment involved in clinical decision-making that goes beyond any algorithm.
But humans also have weaknesses as clinicians. Sometimes that judgment is flawed, overly influenced by anecdote and recent experience (rather than the best evidence), and subject to a long list of cognitive biases.
The solution, therefore, is again to combine the best of human intelligence and computer support. We are moving slowly in this direction, but we have a long way to go.
The Watson experience perhaps can be a good kick in the pants, to increase support for the incorporation of medical expert systems into the practice of medicine. But it also shows that such systems will not be easy to incorporate. In order to provide real-time natural language information Watson required thousands of computer cores. Perhaps the real lesson of Watson is that we are still 20 years away from widespread use of such systems. When there is a Watson app for the smartphone, every doctor will carry one into the patient room.