Claim
Each new model release moves capabilities discontinuously: what was impossible becomes easy, and the size of the leaps grows each cycle. AI workflows must therefore be pinned to capability targets you re-baseline quarterly, not to one model snapshot whose performance you optimize against.
Mechanism
A workflow optimized for the current model's exact behavior (its prompt patterns, context-window tricks, output shapes) accumulates dependencies that break when the next model arrives. The next model often makes the workaround unnecessary because the capability is now native, yet by then the workflow has hardened around the workaround. Re-baselining quarterly means: define the capability target (e.g., "produce a credible 15-page strategy doc with primary research"), test it against the current model, then strip the workarounds whose justification has disappeared. Workflows pinned to capabilities outlive their underlying model snapshots; workflows pinned to a snapshot decay.
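A minimal sketch of that quarterly pass, assuming a hypothetical in-house eval harness; the names (`CapabilityTarget`, `Workaround`, `quarterly_rebaseline`) and the gap checks are illustrative, not from the source.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Workaround:
    name: str        # e.g. "chunk-and-stitch prompt chain"
    covers_gap: str  # the capability gap that originally justified it

@dataclass
class CapabilityTarget:
    description: str  # model-agnostic goal, e.g. "15-page strategy doc with primary research"
    # gap name -> eval that returns True once the current model handles it natively
    gap_checks: dict[str, Callable[[], bool]]
    workarounds: list[Workaround] = field(default_factory=list)

def quarterly_rebaseline(target: CapabilityTarget) -> list[Workaround]:
    """Re-test the capability target against the current model and return
    the workarounds whose justifying gap has closed (candidates to strip)."""
    closed_gaps = {gap for gap, check in target.gap_checks.items() if check()}
    return [w for w in target.workarounds if w.covers_gap in closed_gaps]

# Illustrative usage: the lambda stands in for a real eval run against the current model.
target = CapabilityTarget(
    description="Produce a credible 15-page strategy doc with primary research",
    gap_checks={"long_coherent_output": lambda: True},  # pretend this quarter's eval passed
    workarounds=[Workaround("chunk-and-stitch prompt chain", "long_coherent_output")],
)
print([w.name for w in quarterly_rebaseline(target)])  # -> ['chunk-and-stitch prompt chain']
```

The point of the sketch is the coupling it makes explicit: every workaround records the gap that justified it, so a passing eval mechanically nominates it for removal.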
Conditions
Holds when:
- The team uses AI tooling for a sustained workflow (not one-off prompts).
- Model release cadence is fast enough that quarterly re-baselining catches real capability shifts.
- The team has the discipline to strip workarounds rather than accumulate them.
Fails when:
- The workflow has hard regulatory constraints that lock it to one model version.
- Re-baselining cost exceeds the value of capturing each leap.
- The team can't tell capability from workaround coupling, so re-baselining becomes reflexive churn that never removes dead config.
Evidence
"Every few months a new model arrives... something that was impossible becomes easy, while the size of the leaps grows each new release cycle."
Honest concession: rough edges remain in long-form fiction generation. The frame applies to capability-led workflow design, not to blind upgrades.
· Ethan Mollick, Sign of the Future: GPT-5.5, https://www.oneusefulthing.org/p/sign-of-the-future-gpt-55, 2026-04-23
Signals
- A standing quarterly re-baseline task strips workarounds when their justification disappears.
- Workflow specs name the capability target separately from the model used to deliver it (a minimal spec sketch follows this list).
- New-model arrivals trigger a structured eval pass, not panic re-tooling.
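A minimal sketch of such a spec, keeping the capability target and the model snapshot in separate fields; the field names and the model identifier are illustrative assumptions, not from the source.

```python
# Hypothetical workflow spec: the capability target is named independently of the
# model snapshot, so a re-baseline swaps the model without rewriting the target.
workflow_spec = {
    "capability_target": "Produce a credible 15-page strategy doc with primary research",
    "current_model": "model-snapshot-2026-04",  # illustrative identifier, replaced at each re-baseline
    "workarounds": [
        {"name": "chunk-and-stitch prompt chain", "covers_gap": "long_coherent_output"},
    ],
    "rebaseline_cadence": "quarterly",
}
```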
Counter-evidence
Frequent re-baselining costs operator time and can destabilize workflows that depend on consistent behavior. For high-stakes regulated workflows, pinning to a known model snapshot is the safer move. The cadence (quarterly) is a default, not a universal.
Cross-references
- "Build for the model six months out, the current model will eat your scaffolding" (Anthropic's parallel framing).
- "When a new model lands, re-read the system prompt and remove crutches" (companion: strip the workaround when the model catches up).
- "Treat `.claude/` as a deployable artifact with versioning and rollback" (Huryn's config-layer hygiene that makes re-baselining tractable).