Sep 26 2019

Using AI for Diagnosis

A recent systematic review and meta-analysis of studies comparing humans to artificial intelligence (AI) in diagnosing radiographic images found:

Analysis of data from 14 studies comparing the performance of deep learning with humans in the same sample found that at best, deep learning algorithms can correctly detect disease in 87% of cases, compared to 86% achieved by health-care professionals.

This is the first time we have evidence that the performance of AI has ticked over that of humans. The authors also state, as is often the case, that we need more studies, we need real world validation, and we need more direct comparisons. But at the very least AI is in the ballpark of human performance. If our experience with chess and Go is any guide, the AI algorithms will only get better from here. In fact, given the lag in research, publication, and then review time, they probably already have.

I think AI is poised to overtake humans in diagnosis more broadly, because this particular task is right in the sweet spot of deep learning algorithms. Also, it is very challenging for humans, who fall prey to a host of cognitive biases and heuristics that hamper optimal diagnosis. A lot of medical education is focused on correcting these biases, and replacing them with clinical decision-making that works better. But no clinician is perfect or without blind spots. Even the most experienced clinician also has to contend with an overwhelming amount of information.

There are a couple of ways to approach diagnosis. The one in which humans excel is the gestalt approach, which is essentially based on pattern recognition. With experience clinicians learn to recognize the overall pattern of a disease. This pattern may include signs or symptoms that are particularly predictive. Eventually the pieces just click into place automatically.

But even then clinicians need to back up their intuition and pattern recognition with the analytical approach. Even if you are 99% confident in your diagnosis, that still means you will be wrong 1% of the time, which for a busy clinician could be once every week or two. So clinicians also learn to hedge their bets, and to cover the things they cannot afford to miss. Therefore they combine this approach with a risk vs benefit analysis, in an attempt to maximize benefit and minimize risk.
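The "once every week or two" figure is simple arithmetic, and a quick sketch makes it concrete. The caseloads below (10 and 20 diagnoses per working day) are my own illustrative assumptions, not numbers from the review, and the function name is made up for this example.

```python
def days_between_errors(confidence: float, diagnoses_per_day: float) -> float:
    """Expected number of working days between wrong diagnoses."""
    error_rate = 1.0 - confidence          # 99% confident -> 1% wrong
    errors_per_day = error_rate * diagnoses_per_day
    return 1.0 / errors_per_day


# Assumed caseloads (hypothetical): 10 and 20 diagnoses per working day.
for per_day in (10, 20):
    days = days_between_errors(0.99, per_day)
    print(f"{per_day} diagnoses/day: one error about every {days:.0f} working days")
```

At 20 diagnoses a day that works out to about one error per five-day work week, and at 10 a day about one every two weeks.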

But here there are biases also. We tend to be risk averse, and to not want to be the proximate cause of harm. Would you rather take a 20% risk of directly causing harm or a 40% chance of passively letting harm occur? In order to have the best outcomes for patients you would take the 20% risk, but we may reflexively recoil from this option.

In the end diagnosis is a lot of statistics and math. To do this optimally you need to have a lot of information at your fingertips. This is exactly what computers are good at. If you have deep learning AI, which can have a vast database of information, and use algorithms to chart the optimal path to the correct diagnosis, this should easily outperform the best human diagnostician. The real trick is in developing and maintaining these systems.

As a simple example, you may have played with the Akinator, which uses an algorithm to guess what character or thing you are thinking about among the entire world of possibilities. The algorithm whittles the possibilities down in ways that may not be intuitive, but is surprisingly accurate.
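I don't know how the Akinator is actually implemented, so treat the following as a sketch of the general idea only: at each step, ask the question that splits the remaining possibilities most evenly, so each answer eliminates as many candidates as possible. The tiny knowledge base, the question names, and the functions are all invented for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy knowledge base: candidate -> answers to yes/no questions.
CANDIDATES: Dict[str, Dict[str, bool]] = {
    "cat":      {"is_animal": True,  "can_fly": False, "is_fictional": False},
    "eagle":    {"is_animal": True,  "can_fly": True,  "is_fictional": False},
    "dragon":   {"is_animal": True,  "can_fly": True,  "is_fictional": True},
    "airplane": {"is_animal": False, "can_fly": True,  "is_fictional": False},
    "unicorn":  {"is_animal": True,  "can_fly": False, "is_fictional": True},
    "bicycle":  {"is_animal": False, "can_fly": False, "is_fictional": False},
}


def best_question(remaining: List[str], questions: List[str]) -> str:
    """Pick the question whose yes/no split of the remaining candidates is most even."""
    def imbalance(question: str) -> int:
        yes = sum(1 for name in remaining if CANDIDATES[name][question])
        return abs(2 * yes - len(remaining))  # 0 means a perfect 50/50 split
    return min(questions, key=imbalance)


def guess(answer: Callable[[str], bool]) -> Tuple[List[str], List[str]]:
    """Ask questions until one candidate is left or the questions run out."""
    remaining = list(CANDIDATES)
    questions = ["is_animal", "can_fly", "is_fictional"]
    asked: List[str] = []
    while len(remaining) > 1 and questions:
        question = best_question(remaining, questions)
        questions.remove(question)
        asked.append(question)
        reply = answer(question)
        remaining = [n for n in remaining if CANDIDATES[n][question] == reply]
    return remaining, asked


# The player is thinking of a dragon and answers truthfully.
print(guess(lambda q: CANDIDATES["dragon"][q]))
```

With this toy data the first question asked is "can_fly", because it is the only one that divides the six candidates exactly in half. A diagnostic system would face the same kind of choice, just with tests and symptoms in place of trivia questions.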

I think we are poised for an AI revolution in medicine. Medicine is already hard, and is getting more difficult every day as thousands of studies are published updating our collective database of deep information. Decision-making is also complex, and there are many pitfalls. What we need to do is combine what humans do well with what AI does well. This can vastly reduce errors in medicine and increase efficiency, something which is greatly needed to stem the rising costs of health care.

Here’s what humans do well – they interact with other humans. This means extracting clinically useful information and putting it into context. This is more difficult than you may intuitively think if you are not a clinician. Patients don’t give you objective facts you can plug into an AI. They tell you their own biased narrative of their own illness, which is full of distortions and error. Deconstructing and then reconstructing a medical narrative using various sources of information is a bit of an art form, and can take years to develop.

People are also needed to explain to patients their options, and then help them choose among those options. Computers are great at crunching numbers, but they cannot tell you the relative value of quality of life vs duration of life. You cannot optimize outcomes unless you know what criteria to use. Should the AI follow the diagnostic path that is quickest, least painful, most accurate, least expensive, entails the lowest risk of complications, or offers the greatest chance of a cure? In many everyday cases there may be one clear path, but oftentimes there isn’t. This is a conversation that an experienced human clinician needs to have with the patient, weighing all the variables and then personalizing the approach to that patient’s priorities.

So an AI diagnostic algorithm would be a tool that the clinician would use. We basically need access to one in every doctor’s office, clinic, and hospital floor. This will not replace human clinical decision-making; it will enhance it. People make mistakes, they have gaps in their knowledge, they forget things, and they are biased by emotion and recent experience. Computers have none of these flaws. They are cold, consistent, and logical, and they can draw on a database of specific information.

AI could, for example, once the relevant symptoms and results from preliminary medical tests have been entered, create a list of possible diagnoses with exact percentages. Then it can chart pathways for narrowing the list down, knowing which specific information will have the greatest predictive value.
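One standard way to turn findings into percentages is Bayes' rule: start with how common each diagnosis is, then reweight by how likely the observed finding is under each one. This is a minimal sketch of that calculation, not a description of any existing diagnostic system. The condition names and every number below are invented placeholders, not clinical data.

```python
from typing import Dict


def update_differential(priors: Dict[str, float],
                        likelihoods: Dict[str, float]) -> Dict[str, float]:
    """Bayes' rule: posterior is proportional to prior times P(finding | diagnosis)."""
    unnormalized = {dx: priors[dx] * likelihoods[dx] for dx in priors}
    total = sum(unnormalized.values())
    return {dx: weight / total for dx, weight in unnormalized.items()}


# Made-up starting probabilities for three placeholder conditions.
priors = {"condition_a": 0.70, "condition_b": 0.20, "condition_c": 0.10}

# Made-up chance of a positive test result under each condition.
positive_test = {"condition_a": 0.10, "condition_b": 0.80, "condition_c": 0.50}

posterior = update_differential(priors, positive_test)
for dx, p in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{dx}: {p:.0%}")
```

In this toy case a single positive result moves condition_b from 20% to roughly 57%, overtaking condition_a, which started out as the most likely diagnosis. A test whose result would not shift the percentages much is, by the same logic, one with little predictive value.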

AIs could also use deep learning not just to use information but to create information. Remember: these are learning algorithms. Plug them into real world experience, meaning that the system is told what the ultimate diagnosis turned out to be, and they can use that information to improve their own percentages. If every hospital were using a networked deep learning diagnostic AI, and all patient outcomes were recorded, this would amount to a vast epidemiological study. Decades of epidemiological research could be done in years.
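To show the feedback loop in its simplest possible form, here is a sketch that just counts confirmed outcomes and recomputes the percentages from them. A real deep learning system would be far more sophisticated than a tally. The class, its methods, and the example records are hypothetical.

```python
from collections import Counter
from typing import Dict


class OutcomeTracker:
    """Tally confirmed diagnoses per presenting finding and turn them into percentages."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def record(self, finding: str, confirmed_diagnosis: str) -> None:
        """Called once the ultimate diagnosis for a case is known."""
        self.counts[(finding, confirmed_diagnosis)] += 1

    def estimate(self, finding: str) -> Dict[str, float]:
        """Share of each confirmed diagnosis among cases with this finding."""
        matching = {dx: n for (f, dx), n in self.counts.items() if f == finding}
        total = sum(matching.values())
        return {dx: n / total for dx, n in matching.items()} if total else {}


tracker = OutcomeTracker()
# Invented outcome records for one placeholder finding.
for diagnosis in ["condition_a", "condition_a", "condition_a", "condition_b"]:
    tracker.record("finding_x", diagnosis)

print(tracker.estimate("finding_x"))
```

Every recorded outcome nudges the estimates, which is why a networked system fed by many hospitals would sharpen its percentages much faster than any single clinic could.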

We have the technology right now to do this. We just need to do it. This would be a worthwhile investment, likely saving health care costs many orders of magnitude greater than the infrastructure investment. Not to mention, of course, human lives and suffering.

There is research and development being done. It’s just not nearly at the scale where it should be, given the potential. This is an area where the government can make a massive investment, and save money in the long run. There is an advantage to having one integrated system, and that probably needs to come from the top down. It may only require giving industry a nudge, some kind of incentive, with specific goals. It will happen eventually, even without government involvement, but we should be doing everything to make it happen as soon as possible.
