Engineering · April 9, 2026 · 5 min read

Human-in-the-loop, by design — not as a fallback

By Aghasi Gasparyan

Most 'human-in-the-loop' implementations I've reviewed are escape valves — a place to bail out when the model's wrong. Ours are routing decisions the system makes on every step, and they belong to the orchestrator, not the model.

The difference looks small on a whiteboard. In production it's the gap between a system that works and one that gets shut off in week six because operators stopped trusting it.

Three places humans belong

We've shipped enough document-heavy workflows now that the pattern has stabilized. Humans go in three places, and almost nowhere else.

1. Ambiguous classifications

When the calibrated confidence on a classification falls below your threshold, route to a human. Calibrate that confidence against your real data; the model's self-reported confidence is often badly miscalibrated and shouldn't be trusted on its own. Don't ask the model to pick anyway. The cost of a wrong classification compounds downstream; the cost of human review on the 4% of ambiguous cases is bounded and small.
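A minimal sketch of that rule, assuming the orchestrator maps the model's raw score through a calibration table built from labeled production data. The names (`CALIBRATION`, `REVIEW_THRESHOLD`, `route_classification`) and every number in the table are hypothetical, chosen only to illustrate the shape of the check.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical calibration table: (lower bound of raw-score bucket,
# accuracy actually observed for that bucket on labeled production data).
CALIBRATION = [(0.0, 0.40), (0.5, 0.62), (0.7, 0.81), (0.9, 0.93), (0.97, 0.99)]
REVIEW_THRESHOLD = 0.90  # assumed value; tune against your own data


def calibrated_confidence(raw_score: float) -> float:
    """Map the model's self-reported score to the accuracy observed in its bucket."""
    bounds = [lower for lower, _ in CALIBRATION]
    idx = bisect_right(bounds, raw_score) - 1
    return CALIBRATION[max(idx, 0)][1]


@dataclass
class Decision:
    label: str
    confidence: float
    route: str  # "auto" or "human_review"


def route_classification(label: str, raw_score: float) -> Decision:
    """Route on calibrated confidence; never force a pick below the threshold."""
    conf = calibrated_confidence(raw_score)
    route = "auto" if conf >= REVIEW_THRESHOLD else "human_review"
    return Decision(label, conf, route)
```

With this table, a raw score of 0.85 lands in a bucket that was only right 81% of the time, so it goes to review even though the model "felt" fairly sure.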

2. Novel exceptions

When the system encounters a shape of input it hasn't seen — a new document type, a counterparty whose data layout doesn't match anything in your history — escalate. The temptation is to let the model 'try its best.' Don't. Novel inputs are exactly the cases where models hallucinate confidently. They're also where you learn what to add to the system, so the next time it sees that shape it routes correctly. Every novel exception is a free training signal.
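One way to make "a shape it hasn't seen" concrete is a fingerprint of the document type plus the set of fields found on it. The sketch below assumes that definition of shape; `NoveltyGate` and `layout_fingerprint` are hypothetical names, and a real system might fingerprint layouts quite differently.

```python
from dataclasses import dataclass, field


def layout_fingerprint(doc_type: str, field_names: list[str]) -> tuple[str, frozenset[str]]:
    """Order-insensitive fingerprint of a document's shape (an assumed definition)."""
    return (doc_type, frozenset(field_names))


@dataclass
class NoveltyGate:
    known: set = field(default_factory=set)       # shapes the system handles today
    escalated: list = field(default_factory=list)  # backlog of shapes to add

    def check(self, doc_type: str, field_names: list[str]) -> str:
        """Return 'process' for a known shape, 'escalate' for anything else."""
        fp = layout_fingerprint(doc_type, field_names)
        if fp in self.known:
            return "process"
        # Unknown shape: do not let the model guess; record it for follow-up.
        self.escalated.append(fp)
        return "escalate"

    def learn(self, doc_type: str, field_names: list[str]) -> None:
        """Register a shape once a human has reviewed it and support is added."""
        self.known.add(layout_fingerprint(doc_type, field_names))
```

The `escalated` list is the training signal: each entry is a shape someone should look at and, once handled, register with `learn` so the next occurrence routes normally.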

3. Irreversible writes

For anything that moves money, signs a contract, sends a binding communication, or writes to a system of record where the audit log matters, a human signs off. Always. Not because the model is necessarily wrong, but because the costs are asymmetric. A confident wrong write costs you weeks; a 30-second human approval costs you almost nothing. The expected-value math is one-sided.
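In code, this can be a hard gate in front of the executor rather than a judgment call. The sketch below is one possible shape, assuming actions are tagged with a kind; the action names, `ApprovalRequired`, and `execute` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical action kinds that count as irreversible in this sketch.
IRREVERSIBLE_ACTIONS = {
    "move_money",
    "sign_contract",
    "send_binding_message",
    "write_system_of_record",
}


class ApprovalRequired(Exception):
    """Raised when an irreversible action is attempted without sign-off."""


@dataclass
class Action:
    kind: str
    payload: dict
    approved_by: Optional[str] = None  # set by the review step, never by the model


def execute(action: Action, perform: Callable[[Action], str]) -> str:
    """Run the action, refusing irreversible kinds that lack a named approver."""
    if action.kind in IRREVERSIBLE_ACTIONS and not action.approved_by:
        raise ApprovalRequired(f"{action.kind} needs human sign-off")
    return perform(action)
```

The gate ignores model confidence on purpose: for these action kinds no score is high enough to skip the approver.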

Where it goes wrong

The biggest failure mode isn't bad classification — it's alert fatigue. If your system routes 40% of cases to humans, your operators stop reading and start rubber-stamping. Within a month, the human-in-the-loop step is theatre.

The fix is calibration. The threshold for 'route to human' isn't a hyperparameter you set once. It's a number you tune monthly against the rate of human disagreement, both on routed cases and on a spot-checked sample of cases that weren't routed. If humans are agreeing with the model 99% of the time on what gets routed, the threshold is too sensitive and you're wasting their attention. If they're disagreeing 30% of the time on the sample of what doesn't get routed, the threshold is too loose and you're missing real errors.
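A rough sketch of that monthly check. It assumes you log whether the reviewer agreed with the model on each routed case, and that you also spot-check a sample of unrouted cases (an assumption of this sketch; without some such audit the second rate can't be measured). The 0.99 and 0.30 cutoffs echo the illustrative figures above and are not recommended values; the function names are hypothetical.

```python
def disagreement_rate(human_agreed: list[bool]) -> float:
    """Fraction of reviewed cases where the human overruled the model."""
    if not human_agreed:
        raise ValueError("need at least one reviewed case")
    return human_agreed.count(False) / len(human_agreed)


def recommend_adjustment(
    routed_agreed: list[bool],
    audited_unrouted_agreed: list[bool],
    too_sensitive_at: float = 0.99,  # agreement on routed cases (illustrative)
    too_loose_at: float = 0.30,      # disagreement on unrouted audit (illustrative)
) -> str:
    """Return 'route_more', 'route_less', or 'hold' for this review period."""
    routed_agreement = 1.0 - disagreement_rate(routed_agreed)
    unrouted_disagreement = disagreement_rate(audited_unrouted_agreed)
    # Missed errors are the costlier failure, so check for them first.
    if unrouted_disagreement >= too_loose_at:
        return "route_more"
    if routed_agreement >= too_sensitive_at:
        return "route_less"
    return "hold"
```

Checking the unrouted audit first is a design choice in this sketch: when both signals fire, it prefers sending more cases to humans over sending fewer.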

We tune this against real production data on every engagement, and we name the operator who owns the calibration cadence. If no one owns the threshold, it drifts.

The orchestrator owns routing, not the model

Last point — and this one is architectural. The decision of whether to involve a human is made by the orchestrator, not the model itself. The model produces a structured output: a classification, a set of extracted fields, a confidence vector. The orchestrator decides what to do with it.

Why this matters: if the model is making routing decisions, you can't change routing logic without retraining or re-prompting. If the orchestrator is making routing decisions, your operations team can adjust thresholds in a config file at 3pm on a Tuesday. That difference is what 'human-in-the-loop by design' actually means in code.
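To make the split concrete, here is a minimal sketch in which the model returns only structured output and the orchestrator routes from a config. The config layout, the `ModelOutput` fields, and the `route` function are assumptions for illustration, not a description of any particular system.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOutput:
    label: str          # the classification
    fields: dict        # extracted fields
    confidences: dict   # label -> calibrated confidence


def load_routing_config(text: str) -> dict:
    """Parse routing thresholds; in practice this would be read from a file."""
    return json.loads(text)


def route(output: ModelOutput, config: dict) -> str:
    """The orchestrator decides; the model never sees or sets the threshold."""
    threshold = config["thresholds"].get(output.label, config["default_threshold"])
    confidence = output.confidences.get(output.label, 0.0)
    return "auto" if confidence >= threshold else "human_review"


# Hypothetical config before and after an operator tightens one threshold.
BEFORE = '{"default_threshold": 0.90, "thresholds": {}}'
AFTER = '{"default_threshold": 0.90, "thresholds": {"invoice": 0.95}}'

output = ModelOutput("invoice", {"total": "120.00"}, {"invoice": 0.92})
before_route = route(output, load_routing_config(BEFORE))
after_route = route(output, load_routing_config(AFTER))
```

The same model output routes differently under the two configs, and nothing about the model or its prompt changed between them.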
