UnitedHealth, nH Predict, and What a 90% Reversal Rate Proves

Ola Kolade·May 24, 2026

A Medicare Advantage member has a stroke and lands in a skilled nursing facility. The treating physician lays out a recovery plan that needs weeks of post-acute care. Within days, the insurer's system fires off a notice cutting that coverage off. Nothing about the patient's actual progress drove it, not the physician's read, not the facility's notes. A prediction drove it: a model trained on how long patients with this diagnosis usually need care has decided this one hit the limit. The family appeals. A person reads the clinical record. The denial is reversed.

That sequence played out thousands of times across UnitedHealth's Medicare Advantage plans between 2020 and 2023. The model, nH Predict, is run by NaviHealth, a subsidiary UnitedHealth acquired in 2020. It takes diagnosis codes, procedure histories, and demographics and predicts how long a post-acute patient should need to stay. When the real stay runs past the prediction, the system flags the case to cut coverage. The complaint in Estate of Gene B. Lokken v. UnitedHealth Group, filed in November 2023 in the District of Minnesota, alleges that more than 90% of the terminations patients challenged were reversed on internal appeal or federal review. Several of the named plaintiffs are the estates of patients who died during the coverage fights.

The 90% is the center of the case, so it's worth being careful about what it does and doesn't show. It doesn't show that nH Predict was badly calibrated. A population recovery model can be accurate across a big cohort and still spit out the wrong answer for a particular patient whose situation doesn't match the average: a stroke patient with other conditions stacked on top, a surgical patient who picks up a post-op infection, an older patient whose recovery just doesn't follow the curve the training data expected. None of that is model failure in any technical sense. It's the ordinary distance between a population average and one real person. The model was never built to make the call on an individual; it was built to predict typical recovery windows. What matters is what happened between the prediction and the coverage decision.
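The distinction between cohort accuracy and individual accuracy can be made concrete with a toy calculation. The sketch below uses made-up lengths of stay, not data from the case, and the function name and fields are hypothetical. It assumes the simplest possible "model", one that predicts the cohort average for everyone, and is not a description of how nH Predict works.

```python
from statistics import mean


def cohort_vs_individual(actual_days, predicted_days):
    """Compare one cohort-level prediction against each patient's real stay."""
    # Positive error: the patient needed more days than predicted.
    errors = [actual - predicted_days for actual in actual_days]
    exceeded = [e for e in errors if e > 0]
    return {
        "mean_error_days": mean(errors),
        "share_exceeding": len(exceeded) / len(actual_days),
        "worst_shortfall_days": max(errors),
    }


# Illustrative, invented lengths of stay (days) for ten patients
# who share a diagnosis. These are NOT real figures.
stays = [12, 14, 15, 16, 18, 20, 21, 22, 30, 42]

# Assumption: the "model" simply predicts the cohort average.
prediction = mean(stays)

summary = cohort_vs_individual(stays, prediction)
print(prediction, summary)
```

On these invented numbers the average error is zero, so the prediction is exactly right for the cohort, yet three of the ten patients need more days than predicted and one needs three weeks more. A flag that fires whenever a stay passes the prediction would hit all three.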

What the 90% reversal rate does show is that whatever sat between the model's output and the termination notice wasn't catching what a later human review almost always caught. If a real clinical review, one that reads the treating physician's notes, looks at the patient's current condition, and applies Medicare Advantage criteria to that specific person, overturns the call nine times in ten, then the original call and a genuine review almost never land in the same place. That isn't a calibration gap. It's what an absent reviewer leaves behind. When two reviews of the same case disagree nine times out of ten, one of them isn't really happening.

The contract makes it sharper. UnitedHealth's Medicare Advantage plan documents put coverage determinations in the hands of clinical staff and physicians, and the wording is exact: not that the decisions would be informed by clinical staff, not that algorithms would help physicians reach them, but that clinical professionals would make them. Judge Tunheim's February 2025 ruling let the breach-of-contract and good-faith claims move forward on that footing: if nH Predict pushed the promised clinical review aside instead of backing it up, that's a breach of the plan's own terms. The theory needs no AI-specific law. It's plain contract law, aimed at the distance between what the documents promised and what the deployment did.

The discovery order that came next, in March 2025, hints at how much those internal records may hold. The court ordered UnitedHealth to hand over nH Predict's development and deployment history back to January 2017, its policies for post-acute care claims, the training materials given to care coordinators and medical directors, performance reviews and compensation data for the staff making coverage calls, and the files of its internal AI review board, members named. An order that wide is what a coverage dispute turns into once it becomes a dispute about process. The fight isn't over whether one patient should have gotten more care. It's over whether the review UnitedHealth's own documents describe ever existed in a form anyone would recognize as clinical judgment rather than a rubber stamp on the algorithm.

nH Predict is one instance of something showing up across the country. Prior authorization AI is now under active legislative scrutiny in several states. Washington's SB 5395, effective in 2026, bars health carriers from leaning solely on AI to deny, delay, or limit care and requires that adverse calls come from licensed professionals. Alabama's SB 63, effective October 2026, makes insurers certify every year that their AI tools don't swap group datasets in for a patient's individual circumstances. Maryland's HB 1563, effective June 2026, makes insurers report adverse decisions to the Insurance Commissioner each quarter, AI involvement included. The National Association of Insurance Commissioners started a 2026 pilot of its AI Systems Evaluation Tool to gauge how insurers govern and test their systems for bias and accuracy. All of it points the same way.

But Lokken doesn't lean on any of that new law. It rests on contract and the implied covenant of good faith. The plan said physicians would decide. The deployment slid a population prediction model into their place. The appeal record shows how far apart the two run. That setup was available before any state passed an AI healthcare statute, and it reaches any insurer whose plan documents describe a clinical review its AI has quietly taken over.

The Prenuvo case rhymes with this one. Prenuvo's exposure comes from the gap between what it marketed (AI-assisted detection of hundreds of conditions) and what its record holds (a one-page report with no trail of the review behind it). UnitedHealth's gap is the same shape: physician-led coverage decisions on paper, an algorithm cutting off care from population statistics in practice. Each company's own documents set the bar its deployment is then measured against, and in each, the missing record of what review actually happened is what leaves the gap impossible to defend.

The override-rate problem plays out differently here than in lending or hiring, and the difference is the whole point. In lending, a low override rate cuts two ways: maybe the model is accurate, maybe the reviewer is rubber-stamping, and the number alone won't say which. In prior authorization, the appeal reversal rate breaks the tie. It supplies the comparison the first pass never had. When a real second review that actually weighs the individual case comes out the other way 90% of the time, the first pass has been held up against genuine review and shown to produce systematically different results. There's nothing ambiguous left. It's about as close to a controlled experiment as a liability record ever gets: same patient, same clinical facts, one process that ends coverage and one that restores it, run thousands of times with a 90% disagreement rate. A comparable system at another insurer, Cigna's PxDx, threw off the same kind of evidence from the other end, in an internal scorecard logging an average of 1.2 seconds of physician time per denial.
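The two rates discussed above measure different things, and a small sketch makes the difference visible. Everything here is hypothetical: the `Determination` record, its field names, and the twenty sample cases are invented for illustration and do not come from the litigation record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Determination:
    model_flagged: bool                   # model recommended ending coverage
    first_pass_denied: bool               # what the initial determination did
    appeal_upheld_denial: Optional[bool]  # None if the case was never appealed


def override_rate(cases):
    """Share of model-flagged cases where the first pass did NOT deny."""
    flagged = [c for c in cases if c.model_flagged]
    if not flagged:
        return None
    overridden = [c for c in flagged if not c.first_pass_denied]
    return len(overridden) / len(flagged)


def reversal_rate(cases):
    """Share of appealed denials that the second review overturned."""
    appealed = [
        c for c in cases
        if c.first_pass_denied and c.appeal_upheld_denial is not None
    ]
    if not appealed:
        return None
    overturned = [c for c in appealed if not c.appeal_upheld_denial]
    return len(overturned) / len(appealed)


# Invented sample: 20 flagged cases, 19 denied, 10 of those appealed.
cases = (
    [Determination(True, False, None)]         # 1 overridden at first pass
    + [Determination(True, True, False)] * 9   # 9 denials reversed on appeal
    + [Determination(True, True, True)]        # 1 denial upheld on appeal
    + [Determination(True, True, None)] * 9    # 9 denials never appealed
)

print(override_rate(cases), reversal_rate(cases))
```

In this invented sample the override rate is 5%, which on its own fits either an accurate model or a rubber stamp. The reversal rate is 90%, and it is the only one of the two that compares the first pass against an independent review. Note that it is computed only over cases that were appealed, so it says nothing directly about the denials nobody challenged.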

UnitedHealth's defense will come down to the claim that clinical staff were involved at every step and that nH Predict only supported the decision, never made it. Whether that holds up against the discovery record is what the trial decides. The appeal reversal rate is the number the defense has to explain away, and any explanation has to square how a process built on genuine clinical review could keep producing calls that a later clinical review throws out almost every time.

Proof of Review captures what the nH Predict record can't: whether a clinician ever opened the individual patient's file, what in it informed the decision, and how long that took, on the determination itself rather than after the appeal.
