AI is proving exceptionally useful for diagnosing diseases from medical images, such as X-rays, at scale. However, AI tools aren’t always able to recognize their own inaccuracies.
In response, Google has developed a new AI system called the Complementarity-driven Deferral-to-Clinical Workflow (CoDoC), which can discern when to trust AI-based diagnoses and prompt a second opinion from a human doctor.
According to the study, CoDoC reduces the workload of analyzing medical scan data by 66%, and by detecting when an AI decision is potentially wrong, it also reduces false positives by 25%.
CoDoC works in parallel with existing AI systems typically used to interpret medical imagery such as chest X-rays or mammograms.
For instance, if an AI tool is interpreting a mammogram, CoDoC assesses whether the tool's confidence in its analysis is strong enough to be relied upon. If there's any ambiguity, CoDoC requests a second opinion from a human expert.
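The core decision here can be pictured as a simple confidence-gating rule. The following is a minimal illustrative sketch, not Google's actual implementation; the function name and thresholds are hypothetical:

```python
# Hypothetical sketch of confidence-based deferral.
# The thresholds (0.2, 0.8) are illustrative assumptions, not values
# from the CoDoC paper.

def route_diagnosis(ai_confidence: float,
                    lower: float = 0.2,
                    upper: float = 0.8) -> str:
    """Accept the AI's call only when its confidence is decisively
    high or low; otherwise defer the scan to a human clinician."""
    if ai_confidence >= upper:
        return "accept-positive"
    if ai_confidence <= lower:
        return "accept-negative"
    return "defer-to-clinician"
```

In this sketch, only the ambiguous middle band of confidence scores triggers a human review, which is how a deferral system can cut clinician workload while still catching uncertain cases.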
Here’s how it works:
- To train CoDoC, Google took data from existing clinical AI tools and compared it with a human clinician's interpretation of the same images. The model was then validated against ground-truth outcomes established by biopsy or other follow-up methods.
- This process enables CoDoC to learn and understand how accurate an AI tool’s analysis and confidence levels are compared to human doctors.
- Once trained, CoDoC can judge whether an AI analysis of scans is trustworthy or if human review is required.
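The training idea above can be sketched as follows. This is an illustrative toy version under stated assumptions, not the published method: it assumes each training record holds the AI's confidence score, the AI's prediction, a clinician's reading, and a ground-truth label (e.g. from biopsy), and it simply learns, per confidence bin, whether the AI or the clinician is the more accurate reader. Bin counts and function names are hypothetical:

```python
# Toy sketch of learning a deferral rule from (AI, clinician, ground-truth)
# comparisons. All names and the binning scheme are illustrative assumptions.

from collections import defaultdict

def learn_deferral_rule(records, n_bins=10):
    """records: iterable of (ai_confidence, ai_pred, clinician_pred, truth).
    Returns the set of confidence bins in which the clinician was more
    accurate than the AI, i.e. the bins where we should defer."""
    ai_hits = defaultdict(int)
    dr_hits = defaultdict(int)
    counts = defaultdict(int)
    for conf, ai_pred, clinician_pred, truth in records:
        b = min(int(conf * n_bins), n_bins - 1)  # which confidence bin
        counts[b] += 1
        ai_hits[b] += (ai_pred == truth)
        dr_hits[b] += (clinician_pred == truth)
    # Defer in every bin where the clinician outperformed the AI.
    return {b for b in counts if dr_hits[b] > ai_hits[b]}

def should_defer(conf, defer_bins, n_bins=10):
    """Apply the learned rule to a new scan's AI confidence score."""
    return min(int(conf * n_bins), n_bins - 1) in defer_bins
```

The point of the sketch is that the deferral model never re-reads the images itself: it only learns where, along the AI tool's own confidence scale, human judgment has historically been the safer bet.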
Alan Karthikesalingam at Google Health UK, who was involved in the research, said, “If you use CoDoC together with the AI tool, and the outputs of a real radiologist, and then CoDoC helps decide which opinion to use, the resulting accuracy is better than either the person or the AI tool alone.”
Further testing of CoDoC was conducted using different mammography datasets and X-rays for tuberculosis screening across various predictive AI systems, yielding positive results.
Krishnamurthy Dvijotham at Google DeepMind noted, “The advantage of CoDoC is that it’s interoperable with a variety of proprietary AI systems.”
However, Helen Salisbury from the University of Oxford points out that some medical diagnostic processes are more complex than those CoDoC was tested with. She says, “For systems where you have no chance to influence, post-hoc, what comes out the black box, it seems like a good idea to add on machine learning. Whether it brings AI that’s going to be there with us all day, every day for our routine work any closer, I don’t know.”
As the researchers highlight, CoDoC’s interoperability means it can slot into different diagnostic workflows.
In other words, AI systems can check one another's work to improve overall accuracy. As the saying goes, four eyes see more than two.