Claim
Agentic AI is compressing production cost at the same time verification cost is rising. METR measured 1.5-13x time savings for technical staff using agentic AI. But when output volume grows faster than verification capacity, judgment, not production, becomes the new bottleneck.
Mechanism
Production cost falls because AI handles generation. Verification cost rises because AI output volume grows while human evaluation time per item stays fixed. The ratio flips: you produce more than you can reliably assess. Judgment is the only thing that does not compress.
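A back-of-the-envelope way to see the flip: if AI multiplies daily output while review capacity stays flat, the unreviewed backlog grows linearly. The sketch below uses illustrative numbers only (output, speedup, and review rates are assumptions, not METR figures).

```python
# Minimal sketch of the ratio flip: production scales with an AI speedup,
# review capacity stays fixed, and the unreviewed backlog grows.
# All numbers are illustrative assumptions, not METR figures.

def backlog_after(days: int, base_output_per_day: float = 4.0,
                  ai_speedup: float = 3.0, review_per_day: float = 5.0) -> float:
    """Unreviewed items accumulated after `days` of work."""
    produced_per_day = base_output_per_day * ai_speedup   # generation compresses
    reviewable_per_day = review_per_day                    # judgment does not
    backlog_growth = max(0.0, produced_per_day - reviewable_per_day)
    return backlog_growth * days

if __name__ == "__main__":
    # Before AI (speedup = 1x): reviewers keep up. At 3x: 7 unreviewed items/day accumulate.
    print(backlog_after(20, ai_speedup=1.0))   # 0.0
    print(backlog_after(20, ai_speedup=3.0))   # 140.0
```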
Conditions
Holds when: AI generates enough volume that human reviewers are the constraint. True for most knowledge work at current model capability levels.
Fails when: Automated evals can substitute for human judgment on the specific output type. Where verification can be mechanized, the constraint shifts back to production.
Evidence
Indig references METR research, citing 1.5-13x time savings, 40% cost reduction, and 60% time reduction as realistic benchmarks for technical staff with agentic AI. His synthesis:
"Judgment is the only thing that doesn't compress."
Signals
- Output queues are longer than review queues can clear
- You have more drafts, more code, more research than you can act on
- The bottleneck in any AI-native workflow is approval, not generation
Counter-evidence
For tasks with objective correctness (unit tests, factual lookups with external verification), automated evals can substitute for human judgment. The judgment-scarcity claim is strongest for tasks where quality is subjective or contextual. The METR research covers technical staff; transferability to other domains is unverified.
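To make that boundary concrete, here is a minimal, hypothetical sketch of mechanized verification: when correctness is objective, a reference test suite can accept or reject generated output without a human in the loop, and the constraint shifts back to production. The function names and test cases are illustrative assumptions, not from the source.

```python
# Sketch of mechanized verification for objectively checkable output:
# an AI-generated function is accepted only if it passes reference tests.
# `candidate_sort` stands in for any generated artifact; names are hypothetical.

def candidate_sort(xs: list[int]) -> list[int]:
    # Imagine this body was produced by an agent.
    return sorted(xs)

def passes_reference_tests(fn) -> bool:
    """Automated eval: no human judgment needed when correctness is objective."""
    cases = [([], []), ([3, 1, 2], [1, 2, 3]), ([5, 5, 1], [1, 5, 5])]
    return all(fn(list(inp)) == expected for inp, expected in cases)

if __name__ == "__main__":
    print(passes_reference_tests(candidate_sort))  # True -> verification is mechanized
```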