The Evidence-to-Person Fit Problem


Evidence-to-Person Fit asks whether clinical-trial findings actually apply to the person, condition, intervention, comparator, outcome, and care context in question.

Just as Product-Market Fit measures how well a product matches its market, Evidence-to-Person Fit measures how well medical evidence matches the patient asking. A treatment recommendation is only useful if the evidence behind it applies to the person receiving it.

Why Medical Evidence-to-Person Fit Is Hard — and How We Address It

1. Clinically complete questions are hard to ask.

1.1 Problem:

Questions like “Does this treatment work?” are often incomplete. A clinically useful question looks like:

“In patients like this person, compared with this alternative, for this outcome, over this time window, what did the evidence show?”

Without the right patient factors, summaries miss clinical-trial details that affect efficacy, safety, eligibility, and applicability.

1.2 Solution:

Our deterministic MCP server is a structured query layer that returns consistent, inspectable answers from clinical-trial findings. AI agents — for patients, clinicians, or developers — ask evidence questions with the patient context preserved, instead of relying on opaque summaries that strip it away.
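To make "patient context preserved" concrete, here is a minimal sketch of what a structured evidence query might carry. The field names and values are illustrative assumptions, not the project's actual MCP interface:

```python
from dataclasses import dataclass

# Hypothetical sketch: the slots a structured evidence query might carry.
# Field names are illustrative, not the actual MCP schema.
@dataclass
class EvidenceQuery:
    population: dict        # patient factors, e.g. age, condition, renal function
    intervention: str       # the treatment being asked about
    comparator: str         # the alternative it is compared against
    outcome: str            # the endpoint that matters to this patient
    follow_up_months: int   # the time window of interest

# A clinically complete question, with the patient context intact:
query = EvidenceQuery(
    population={"age": 67, "condition": "type 2 diabetes", "egfr": 45},
    intervention="empagliflozin",
    comparator="placebo",
    outcome="major adverse cardiovascular events",
    follow_up_months=36,
)
```

The point of the structure is that none of these slots can be silently dropped: an answer is generated against the query object, not against a free-text paraphrase of it.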

2. Summaries strip the slots that determine applicability.

2.1 Problem:

Even good questions fail when the AI has no slots to fill. Without structured fields — population, eligibility, interventions, comparators, outcomes, follow-up windows, subgroup constraints — an answer can cite a study while still hiding the details that decide whether the result applies.

2.2 Solution:

Our schema captures all of those slots. It aligns with the ICH E9(R1) estimand framework and is FHIR-aware for interoperability (repo in progress).

3. Closed schemas hide how claims are structured.

3.1 Problem:

Even with slot definitions, if the schema itself is closed, users can't see what slots exist, propose new ones, or debate which slots matter. As clinical AI increasingly shapes your doctor's care decisions (see the Medical AI Landscape), patients, clinicians, and AI agents need a transparent representation language for clinical-trial findings. By transparent we mean two things: the system cites exact study results, and the schema used to structure them is open, inspectable, and forkable.

3.2 Solution:

Our schema is public, inspectable, and forkable. Anyone can examine, reproduce, and improve how clinical-trial findings are structured.

Prior open-source work — TrialStreamer, EBM-NLP, AlpaPICO — extracts PICO into MeSH-mapped snippets. We add estimand structure and patient-applicability slots, so agents can query findings by the dimensions that determine whether evidence fits a particular person.

4. Parsing can hallucinate structured findings.

4.1 Problem:

Even with an open schema, the conversion from study text to structured findings can be wrong. Extraction can hallucinate subgroups, misrepresent outcomes, drop follow-up windows, or misalign comparators. Without external verification, those errors propagate into every downstream answer.

4.2 Solution:

We open-source the test harness (reproducible checks that extraction behaves correctly) and the test fixtures (parsed findings from real studies — the data model instantiated for each paper). Anyone can point to a specific study and say “this finding was parsed wrong,” propose a correction, or run their own extraction and compare.
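The fixture-plus-harness loop can be sketched in a few lines. Everything here is a hypothetical example under assumed names: the fixture is a hand-verified structured finding for one study, and the harness reports any field where a fresh extraction disagrees with it:

```python
# Hypothetical sketch of a fixture-based extraction check.
# The fixture is a hand-verified structured finding for one study.
FIXTURE = {
    "study_id": "NCT00000000",  # placeholder identifier
    "comparator": "placebo",
    "outcome": "all-cause mortality",
    "follow_up_months": 24,
}

def check_extraction(extracted: dict, fixture: dict) -> list[str]:
    """Return the fields where the extraction disagrees with the fixture."""
    return [k for k, v in fixture.items() if extracted.get(k) != v]

# An extraction that silently dropped the follow-up window is flagged:
bad = {
    "study_id": "NCT00000000",
    "comparator": "placebo",
    "outcome": "all-cause mortality",
}
assert check_extraction(bad, FIXTURE) == ["follow_up_months"]
```

This is what "point to a specific study and say this finding was parsed wrong" looks like operationally: a disagreement is a named field on a named study, not a vague complaint about a summary.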

Citations are provenance; fixtures and tests are accountability. The knowledge graph content and MCP runtime are operated by us; the schema, fixtures, and harness underneath are open. As AI systems increasingly mediate medical evidence, the evidence layer needs independent auditability.


Related: The Medical AI Landscape · Medical AI Developer Tooling · About