We Want to Grade the AI. Did We Grade the Doctor?

In Utah, an AI system called Doctronic is renewing prescription medications without a physician in the loop. It covers about 190 commonly prescribed drugs for patients who already have an existing prescription from a human clinician. The cost is $4 per refill.

The reaction has been swift. The AMA warned that accuracy claims don't replace clinical judgment. The American College of Physicians said prescriptions shouldn't be issued without physician involvement.

Most recently, Hopkins trauma surgeon Joseph Sakran and medical student Rahul Gorijavolu wrote in the Washington Post that the program's safety case rests on a single unreviewed preprint — authored entirely by equity holders, with underlying data unavailable to outside researchers.

They're right about that. Doctronic's evidence is thin. The study hasn't been peer-reviewed. The data hasn't been shared. The 99.2% accuracy figure everyone cites was derived from urgent care encounters, not the chronic medication renewals Doctronic is actually doing in Utah.

All fair criticisms. But here's the question nobody is asking:

What's the evidence base for the process Doctronic is replacing?

Prescription renewals are a big part of everyday medicine. In many practices, they move through a fast, pieced-together workflow: a pharmacy request, a portal message, a medical assistant or nurse using a protocol, a chart routed to a covering clinician, a physician clicking "approve" between visits. The details vary from place to place, but the overall picture is familiar. This is high-volume outpatient work done under time pressure.

Has that process been studied? Yes, but not in any single decisive way. There is no landmark trial that established the ordinary human prescription-renewal process as a gold standard. There is no widely accepted national error rate for physician refill approval. What exists instead is a scattered literature on refill workflow, medication discrepancies, prescribing and monitoring errors, pharmacist interventions, and refill protocols designed to catch the things ordinary practice often misses.

What that literature shows is that the standard renewal process has a lot of chances to miss things: missing labs, outdated medication lists, overdue follow-up, drug interactions, medications that should have been adjusted, and medications that should have been stopped. It is a human process that usually works, sometimes misses important things, and has been made safer over time with protocols, pharmacy review, and extra layers of checking.

That matters because the case against AI prescription renewal is often argued as though the human alternative is already a proven standard. It is not. It is the existing standard — which is a different thing.


The Washington Post op-ed also points, reasonably, to the possibility that some medical AI tools may perform differently across demographic groups. Fine. That should be examined. But notice the asymmetry. When AI enters medicine, we ask for subgroup analysis, prospective evidence, peer-reviewed validation, and proof that it works. We rarely ask for the same level of evidence about the human workflow it is being compared against.

That is the real point. The standard human renewal process is not some rigorously validated benchmark. It is an old, familiar process, used at enormous scale, with known weak points and a thinner evidence base than most people would assume for something so common and so important.

This is the double standard at the center of the AI-in-healthcare debate right now. We are applying a level of evidentiary scrutiny to AI that we have never applied — and show no signs of applying — to the human processes AI is replacing.

I'm not arguing that Doctronic should get a pass. It shouldn't. The evidence should be better. The preprint should be peer-reviewed. The data should be available. The study population should match the deployment population.

But if the standard is "rigorous, independent, peer-reviewed evidence of safety before deployment" — and it should be — then that standard has to apply to both sides of the comparison. You can't demand a randomized controlled trial from the AI while grandfathering in the physician on the basis of tradition.

The authors of the WaPo op-ed write that "the window to act is this year, before autonomous AI prescribing expands." They're right about the urgency. But the most useful thing we could do with that window isn't just to test the AI. It's to test both — and find out what we've actually been living with.

[Image: report card of human vs AI prescription renewals]

