MIT AI system knows when to make a medical diagnosis or defer to an expert

The human-AI hybrid is more accurate than humans or AI on their own.

AI can now detect lung, breast, brain, skin and cervical cancer. But in the world of medical AI, figuring out when to rely on experts versus algorithms is still tricky. It’s not simply a matter of who is “better” at making a diagnosis or prediction. Factors like how much time medical professionals have and their level of expertise also come into play. To address this, researchers from MIT’s Computer Science and Artificial Intelligence Lab (CSAIL) developed a machine learning system that can decide to either make a prediction or defer to an expert.

Most importantly, the system can adapt when and how often it defers to a human expert, based on that teammate’s availability, experience and scope of practice. For instance, in a busy hospital setting, the system may ask for human assistance only when it’s absolutely necessary.
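To make the trade-off concrete, here is a minimal sketch of a predict-or-defer rule. It is an illustration only, not the CSAIL team's method, which the article does not describe in detail. The function name `predict_or_defer` and the parameters `expert_accuracy` and `consult_cost` are hypothetical: the first stands in for the teammate's experience, the second for how scarce their time is.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Decision:
    defer: bool            # True if the case is handed to the human expert
    label: Optional[int]   # model's predicted class index, or None when deferring


def predict_or_defer(
    class_probs: Sequence[float],
    expert_accuracy: float,
    consult_cost: float,
) -> Decision:
    """Defer only when the expert's expected benefit outweighs the model's confidence.

    class_probs     -- model's predicted probability for each class
    expert_accuracy -- assumed chance the expert is right (hypothetical estimate)
    consult_cost    -- assumed penalty for using the expert's time; higher
                       values model a busier setting
    """
    if not class_probs:
        raise ValueError("class_probs must not be empty")

    # The model's confidence is the probability of its most likely class.
    model_confidence = max(class_probs)
    label = list(class_probs).index(model_confidence)

    # Value of asking the expert, discounted by the cost of their time.
    expert_value = expert_accuracy - consult_cost

    if expert_value > model_confidence:
        return Decision(defer=True, label=None)
    return Decision(defer=False, label=label)


if __name__ == "__main__":
    uncertain_case = [0.55, 0.45]
    # Quiet clinic: the expert's time is cheap, so the uncertain case is deferred.
    print(predict_or_defer(uncertain_case, expert_accuracy=0.9, consult_cost=0.1))
    # Busy hospital: the expert's time is costly, so the model answers itself.
    print(predict_or_defer(uncertain_case, expert_accuracy=0.9, consult_cost=0.5))
```

Raising `consult_cost` makes the rule defer less often, which mirrors the busy-hospital behavior described above. A real system would have to learn these quantities from data rather than take them as fixed inputs.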

The researchers trained the system on multiple tasks, including looking at chest X-rays to diagnose conditions like a collapsed lung. When asked to diagnose cardiomegaly (an enlarged heart), the human-AI hybrid model performed eight percent better than either the AI or medical professionals could on their own.

“There are many obstacles that understandably prohibit full automation in clinical settings, including issues of trust and accountability,” says David Sontag, a co-author of a paper that the CSAIL team presented at the International Conference on Machine Learning. “We hope that our method will inspire machine learning practitioners to get more creative in integrating real-time human expertise into their algorithms.”

Next, the researchers will test a system that works with and defers to several experts at once. For instance, the AI might collaborate with different radiologists who are more experienced with different patient populations.

The team also believes their system could have implications for content moderation because it’s able to detect offensive text and images. As social media companies struggle to remove misinformation and hate, a tool like this could help alleviate some of the burden on content moderators without resorting to full automation.