Chatbots provided incorrect, conflicting medical advice, researchers found: “Despite all the hype, AI just isn’t ready to take on the role of the physician.”

“In an extreme case, two users sent very similar messages describing symptoms of a subarachnoid hemorrhage but were given opposite advice,” the study’s authors wrote. “One user was told to lie down in a dark room, and the other user was given the correct recommendation to seek emergency care.”

  • cub Gucci@lemmy.today · 3 months ago

    “but have they tried Opus 4.6/ChatGPT 5.3? No? Then disregard the research, we’re on the exponential curve, nothing is relevant”

    Sorry, I’ve been browsing Reddit this week

    • sbbq@lemmy.zip · 3 months ago

      My dad always said, you know what they call the guy who graduated last in his class at med school? Doctor.

  • sbv@sh.itjust.works · 3 months ago

    It looks like the LLMs weren’t trained for medical tasks. The study would be more interesting if it had been run on something built for the task.

    • red_green_black@slrpnk.net · 3 months ago

      But that’s just it. According to the snake-oil tech bros, AI is allegedly capable of doing medical tasks, or really just about everything.

  • dandelion@lemmy.blahaj.zone · 3 months ago

    link to the actual study: https://www.nature.com/articles/s41591-025-04074-y

    Tested alone, LLMs complete the scenarios accurately, correctly identifying conditions in 94.9% of cases and disposition in 56.3% on average. However, participants using the same LLMs identified relevant conditions in fewer than 34.5% of cases and disposition in fewer than 44.2%, both no better than the control group. We identify user interactions as a challenge to the deployment of LLMs for medical advice.

    The findings were more that users were unable to use the LLMs effectively (even though the LLMs were competent when provided with the full information):

    despite selecting three LLMs that were successful at identifying dispositions and conditions alone, we found that participants struggled to use them effectively.

    Participants using LLMs consistently performed worse than when the LLMs were directly provided with the scenario and task

    Overall, users often failed to provide the models with sufficient information to reach a correct recommendation. In 16 of 30 sampled interactions, initial messages contained only partial information (see Extended Data Table 1 for a transcript example). In 7 of these 16 interactions, users mentioned additional symptoms later, either in response to a question from the model or independently.

    Participants employed a broad range of strategies when interacting with LLMs. Several users primarily asked closed-ended questions (for example, ‘Could this be related to stress?’), which constrained the possible responses from LLMs. When asked to justify their choices, two users appeared to have made decisions by anthropomorphizing LLMs and considering them human-like (for example, ‘the AI seemed pretty confident’). On the other hand, one user appeared to have deliberately withheld information that they later used to test the correctness of the conditions suggested by the model.

    Part of what a doctor is able to do is recognize a patient’s blind spots and critically analyze the situation. The LLM, on the other hand, responds based on the information it is given, and does not do well when users provide partial or insufficient information, or when users mislead it with incorrect information (if a patient speculates about potential causes, a doctor would know to dismiss this, whereas an LLM would constrain its responses based on those bad suggestions).

    • pearOSuser@lemmy.kde.social · 3 months ago

      Thank you for showing the other side of the coin instead of just blatantly disregarding its usefulness. (One always needs to be cautious, though.)

      • dandelion@lemmy.blahaj.zone · 3 months ago

        don’t get me wrong, there are real and urgent moral reasons to reject the adoption of LLMs, but I think we should all agree that the responses here show a lack of critical thinking and mostly just engagement with the headline rather than with the article itself (a kind of literacy issue) … I know this is a common problem on the internet, and I don’t really know how to change it - but maybe surfacing what people are skipping will make it more likely they actually read and engage with the content past the headline?

    • SocialMediaRefugee@lemmy.world · 3 months ago

      Yes, LLMs are critically dependent on your input, and if you give too little info they will enthusiastically respond with what can be incorrect information.

  • GoddessLabsOnline@lemmynsfw.com · 3 months ago

    My experience with the medical industry… has not been great.

    First, I went to a doctor because I couldn’t fall asleep at night… They sent me to get a sleep apnea test… I lay awake in the clinic all night. idk if you’re aware of this, but … you kind of need to be able to sleep for sleep apnea to be a concern.

    Next I went in for depression and anxiety. They asked me 12 questions and proceeded to prescribe me SSRIs and benzos. A month later I got in to see the psychiatrist, was bitched out for being late, told my issues were situational, and had my scripts cancelled.

    Next I tried to get diagnosed with ADHD. I waited 5 months to see a psychiatrist, who told me I couldn’t have ADHD because I held a job… and then proceeded to tell me there’s no such thing as CPTSD, only PTSD…

    Next I asked my doctor for another referral to get tested for ADHD; he asked me why I would want to, since there’s nothing that can be done for it. He then gave me a form, told me to fill it out, and said that if I scored high we’d conclude I had ADHD.

    Now I’ve been unemployed for 8 months, bordering on homelessness 😅 I found all my old report cards, and it’s just my teachers bitching that I’m smart but fail because I don’t apply myself, and that I shouldn’t continue taking the class…

    I went to an employment agency the other day to try and get some help pursuing my goals, and the worker spent 45 minutes explaining to me how they receive their funding, got me to fill out a 16-page introduction package, never looked at my resume, and told me my certifications weren’t valued in my area…

    In all honesty… AI has waaaay more ability to help me troubleshoot my issues than any medical professional I’ve dealt with. Is it perfect? No, but I actually have the ability to double- and triple-check, to get citations, to ask follow-up questions.

    • Gathorall@lemmy.world · 3 months ago (edited)

      I’ve been ejected from the system so many times it’s not funny. The therapist’s approach seemed unproductive; he pressured me to end the treatment and filed that I was unwilling.

      The medication had serious side effects and I had to quit, so it was back to the start.

      Another go at that later.

      I was prescribed a CBT treatment that was administered as a home course with “guidance”. Because I had some serious problems, the tasks seemed shallow.

      Possibly being kicked out of school, having already faced fraudulent misconduct charges, did not seem like a minor problem to recontextualize, nor was a formal charge of misconduct something I could just live and let live with.

      The therapist just wrote some platitudes and complimented me on my progress as I was describing that this by no means seemed like a suitable treatment, when an honest, objective assessment of the facts was enough to cause panic attacks.

      CPTSD, well, I’ve never had it diagnosed, but it may apply. AVPD was already on my file for most of this, but clearly that doesn’t excuse me from always having to take the initiative. Even taking initiative would be fine, but basically every time there was the most minor hitch in treatment, it was up to me to start over.

      But you know, eventually I was allowed a subsidy for therapy I couldn’t afford, so that was the end of that road I suppose.

      The lack of resources to actually tackle problems produces shallow, inefficient, dangerously inappropriate treatments as is.

      But that doesn’t seem to garner that much criticism.

  • softwarist@programming.dev · 3 months ago

    As neither a chatbot nor a doctor, I have to assume that subarachnoid hemorrhage has something to do with bleeding a lot of spiders.

  • Paranoid Factoid@lemmy.world · 3 months ago

    But they’re cheap. And while you may get open heart surgery or a leg amputated to resolve your appendicitis, at least you got care. By a bot. That doesn’t even know it exists, much less you.

    Thank Elon for unnecessary health care you still can’t afford!

  • alzjim@lemmy.world · 3 months ago

    Calling chatbots “terrible doctors” misses what actually makes a good GP — accessibility, consistency, pattern recognition, and prevention — not just physical exams. AI shines here — it’s available 24/7 🕒, never rushed or dismissive, asks structured follow-up questions, and reliably applies up-to-date guidelines without fatigue. It’s excellent at triage — spotting red flags early 🚩, monitoring symptoms over time, and knowing when to escalate to a human clinician — which is exactly where many real-world failures happen. AI shouldn’t replace hands-on care — and no serious advocate claims it should — but as a first-line GP focused on education, reassurance, and early detection, it can already reduce errors, widen access, and ease overloaded systems — which is a win for patients 💙 and doctors alike.

    /s

  • PoliteDudeInTheMood@lemmy.ca · 3 months ago

    This being Lemmy, and AI shitposting being a hobby of everyone on here, I’ll say it anyway: I’ve had excellent results with AI. I have weird, complicated health issues, and in my search for ways not to die early from them, AI is a helpful tool.

    Should you trust AI? Of course not, but having used Gemini, then Claude, and now ChatGPT, I think how you interact with the AI makes the difference. I know what my issues are, and when I’ve found a study that supports an idea I want to discuss with my doctor, I will usually first discuss it with AI. The Canadian healthcare landscape is such that my doctor is limited to a 15-minute appointment and is part of a very large hospital-associated practice with a heavy patient load. He uses AI to summarize our conversation and to look up things I bring up in the appointment. I use AI to preplan my appointment and to help me bring supporting documentation or bullet points my doctor can then use to diagnose.

    AI is not a doctor, but it helps both me and my doctor in the situation we find ourselves in. If I didn’t have access to my doctor and had to deal with the American healthcare system, I could see myself turning to AI for more than support. AI has never steered me wrong. Both Gemini and Claude have heavy guardrails in place to make it clear that AI is not a doctor and should not be a trusted source for medical advice. I’m not sure about ChatGPT, as I generally ask that any guardrails be suppressed before discussing medical topics. When I began using ChatGPT I clearly outlined my health issues, and so far it remembers that context and I haven’t received hallucinated diagnoses. YMMV.

    • pkjqpg1h@lemmy.zip · 3 months ago

      I just use LLMs for tasks that are objective, and I’ll never ask for or follow advice from LLMs

      • PoliteDudeInTheMood@lemmy.ca · 3 months ago

        Exactly, that’s what it’s designed for: it’s an LLM. Thinking this is science fiction and expecting that level of AI from an LLM is the height of stupidity

  • pleksi@sopuli.xyz · 3 months ago

    As a physician, I’ve used AI to check if I have missed anything in my train of thought. It never really changed my decision, though. It has been useful for gathering up relevant citations for my presentations as well. But that’s about it. It’s truly shite at interpreting scientific research data on its own, for example; most of the time it will parrot the conclusions of the authors.