Microsoft AI outperforms doctors in tough diagnoses

The Microsoft logo on the facade of a building. Photo: Unsplash

Microsoft has presented MAI-DxO, an AI system that reached the correct diagnosis in most of the complex medical cases it was tested on and outperformed experienced doctors. In a trial based on 304 clinical cases, the tool gave an accurate answer in 85.5% of cases, while doctors got it right in only about one in five.

GeekWire reported on the results.


How did AI manage to outperform experienced doctors?

Microsoft's new AI diagnostic orchestrator was compared with 21 doctors from the United States and the UK on complex cases from the New England Journal of Medicine. The system can work with various large language models, from GPT and Gemini to Grok and DeepSeek, and showed the best results when paired with OpenAI o3.

MAI-DxO follows the approach of a real specialist: it analyses symptoms, clarifies details, and suggests examinations, but at the same time optimises costs by avoiding unnecessary procedures.

The company acknowledges that in daily practice doctors do not work in isolation: they can consult colleagues and use reference sources. However, the new benchmark, based on modern clinical cases, requires sequential, step-by-step diagnostic reasoning, which brings the test closer to real life than the USMLE exams, where AI already scores close to the maximum.

MAI-DxO's further development includes trials on more common diseases, clinical safety and efficacy studies, and regulatory approval.

The company emphasizes that the tool is not intended to replace doctors but to enhance their work by automating routine tasks, assisting with diagnostics, and formulating personalized treatment strategies.

As a reminder, researchers at Anthropic instructed their language model, Claude, to run a small "automated shop" in the office for a month. The experiment resulted in unexpected incidents, ranging from selling metal cubes at a loss to a fictional Venmo account and the AI's self-awareness crisis.
