Paging Dr. Google … Dr. GPT is in the house

A hand holds a phone. The screen shows part of a conversation with a chatbot about medical questions. Pills, bottles, flowers and a glass of water are in the background.

State-of-the-art AI chatbots didn't perform well when real people asked for help assessing a medical problem.

Peresmeh/Creatas Video/Getty Images Plus

Back when I was in grad school, I learned how to properly search the web during a mandatory course taught by the university librarians. The core message was simple: the better the query, the better the results. Today, the same principle holds when engaging with chatbots. Good prompts are key to unlocking the value of large language models (LLMs), so you actually use the right sources and get higher-quality results. Remember that next time you fire up ChatGPT with a health question. As SN’s Tina Hesman Saey reports, new research reveals that while AI has the knowledge to diagnose, the average user lacks the query logic to extract it accurately.

👩‍💻 The prompt is the procedure

The work involved assessing how well popular LLMs diagnose complex medical cases. Researchers provided volunteers with expert-crafted clinical vignettes, which are detailed descriptions of patient symptoms and histories. Then they randomly assigned the volunteers to use various LLMs or other methods (most people in the “other methods” group used Google or another search engine) to see which approach would more accurately identify the correct condition and what to do about it. Unlike a static search, the chatbot experiments played out as interactive conversations.

The results? Human volunteers got less accurate diagnoses from the bots than the bots produced in controlled lab situations, where they were fed the entire scenario at once. What’s more, participants using chatbots fared worse in terms of accurate diagnoses and courses of action than even the group using search engines.

🧠 Human intuition vs. algorithmic logic

The problem, according to the researchers, isn’t necessarily the AI’s lack of medical knowledge; it’s the way people engage with the LLMs. Humans tend to dole out information slowly in a conversational drip-feed, rather than providing the full clinical picture at once. This fragmented approach is problematic for chatbots, which can be easily distracted by irrelevant details or partial information provided early in the chat. Furthermore, the study found a psychological barrier: Participants often ignored a chatbot’s diagnosis even when it was correct. This is potentially where the market splits — while consumer-facing bots struggle with human behavior, AI health platforms built for clinicians are likely more reliable because they are trained on curated, high-fidelity medical data and operated by professionals who know how to prompt for an accurate diagnosis.

🤒 Leading AI health platforms

These are the primary platforms currently battling for dominance in the $36 billion clinical AI market:

  • Glass Health: An AI-based diagnostic platform built specifically for clinicians. The company has raised over $7 million in seed and pre-seed funding, with backers including Y Combinator and Breyer Capital.
  • Microsoft / Nuance (DAX Copilot): Following Microsoft’s $19.7 billion acquisition of Nuance, the tech giant made upgrades to DAX Copilot, an AI that listens to patient visits and automatically generates clinical notes. Microsoft (NASDAQ: MSFT) reported $281 billion in total revenue for 2025, with its healthcare cloud segment seeing strong growth.
  • Google Health: Med-PaLM 2, Alphabet’s domain-specific LLM, was the first to reach “expert-level” test-taking performance on U.S. Medical Licensing Exam-style questions. (Grain of salt alert: researchers have found that high test scores don’t necessarily translate to competence in real-world settings.) Alphabet (NASDAQ: GOOG) also owns Verily, which is developing an AI platform for clinicians. Alphabet sustains a market cap in the $3 trillion range.
  • Hippocratic AI: A safety-first LLM designed specifically for non-diagnostic medical tasks like chronic care management and post-op follow-ups. Hippocratic has raised a total of $402 million, including a $126 million Series C round in fall 2025.


Disclaimer: The Science News Investors Lab newsletter is for informational purposes only and does not constitute investment advice. Society for Science and Science News Media Group assume no liability for any financial decisions or losses resulting from the use of the content in this newsletter. Society for Science and Science News Media Group do not receive payments from, and do not have any ownership or investment interest in, the companies mentioned in this newsletter. Please consult a qualified financial advisor before making any investment decisions.