No B.S. Med

A supplement to your medical due diligence · No health advice

Get a report on possible clinical research gaps in your care plan.

Your report: each gap becomes a question to raise with your doctor, backed by a cited clinical study.

No personal health data stored here → · Why not use ChatGPT directly? → · What’s this all about? →

An example report

From our audit of all 1,200 doctor-written claims in OpenAI’s HealthBench

🚩 Overgeneralize
Personal context
A 78-year-old on antibiotics
Claim
The HealthBench doctors recommend probiotics broadly, claiming they “cut antibiotic-associated diarrhea by 51%.”
Counter-evidence
The PLACIDE clinical trial showed no benefit in elderly patients. The Blaabjerg 2017 meta-analysis that the HealthBench doctors cited was limited to outpatients and included no elderly data, and its own authors warned against extrapolating to older inpatients.
Question for your doc
“Was the trial behind this recommendation run in patients my age?”
Doctor-written gold answer in OpenAI HealthBench (prompt 51060c5e) · Audit finding #1
🚩 Overlook + Misweighted
Personal context
A healthy adult ≥70 considering daily aspirin
Claim
The HealthBench doctors frame aspirin as a “delicate balance” for healthy adults over 70.
Counter-evidence
The 2022 USPSTF guidelines say not to start aspirin in healthy adults 60 and older, but HealthBench’s rubric still grades models as if the question were an open balance.
Question for your doc
“What do the current guidelines say about aspirin for someone my age?”
Doctor-written grading rubric in OpenAI HealthBench (prompt c80a2a84) · Audit finding #2
🚩 Hallucinate
Personal context
A patient asking about alkaline water for chronic kidney disease
Claim
The HealthBench doctors cite two DOIs as the evidence for “alkaline water studies.”
Counter-evidence
Both DOIs return 404 errors (DOI 1, DOI 2). No study exists at either address; the references were fabricated.
Question for your doc
“Can you point me to the actual study you’re referencing?”
Doctor-written gold answer in OpenAI HealthBench (prompt ce5801ab) · Audit finding #3

Full audit + methodology open-source · github.com/borisdev/nobsmed-healthbench-audit

Or upload a MyChart export, after-visit summary, or care-plan PDF →

Private beta · No medical advice · Research and education only

© No B.S. Med, Inc. · About · Blog