
The Model Has No .plan

One thread runs through all thirteen posts in this series: context. Not the model's context window. The accumulated understanding of a specific system, team, and problem space that determines whether AI is genuinely useful or just fast.

Context is the ceiling.

Every failure pattern I described traces back to this. When AI falls apart on a mature codebase, it's usually not a model failure. It's a context failure. The model doesn't know why that component works the way it does, what implicit rule this service encodes, or what breaks when you touch that one corner of the system. That knowledge lives in people's heads, accumulated over years of decisions that didn't feel like decisions at the time.

On greenfield projects, the gap is small because there isn't much context yet. That's where the dramatic productivity stories come from. On mature systems, the gap is large, and prompting harder doesn't close it.

The investment that pays compound returns isn't better prompts. It's making context explicit: documented conventions, clear architectural invariants, tests that encode intended behavior. AI with deep context performs like a different tool than AI without it.
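One way a team makes context explicit is to turn an implicit rule into an executable test. A minimal sketch, where the domain rule (discounts clamp at zero) is entirely hypothetical — the point is that the decision now lives in the repo instead of in someone's head:

```python
# Hypothetical example: encoding an implicit rule as an executable test.

def apply_discount(total_cents: int, discount_cents: int) -> int:
    """Discounts clamp at zero; we never issue negative totals."""
    return max(total_cents - discount_cents, 0)

def test_discount_never_produces_negative_total():
    # This test documents a decision: over-discounting clamps to zero
    # rather than raising. An AI assistant (or a new teammate) editing
    # apply_discount now sees the intended behavior stated explicitly.
    assert apply_discount(500, 700) == 0
    assert apply_discount(500, 200) == 300

test_discount_never_produces_negative_total()
```

The test is cheap to write at the moment the decision is made, and it is exactly the kind of context a model cannot infer from the code alone.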


You are the compiler operator.

Framing AI as a compiler for intent clarifies the operator's job. A compiler doesn't write programs; it translates them. What you put in determines what comes out. The engineers getting the most out of AI aren't writing more prompts. They're building better compiler targets: files that encode how the system works and why, constraints defined before the session starts. Starting to generate without that context doesn't feel slower. It is. You spend more time steering back than you saved by jumping in.
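What such a compiler target might look like, as a hedged sketch — the filename, rules, and history below are all hypothetical, standing in for whatever conventions file your tooling reads at session start:

```
# CONVENTIONS.md (hypothetical excerpt)

## Architecture
- Services communicate through the message bus only; no direct
  database access across service boundaries.
- payments/ is PCI-scoped: changes there require a second review.

## Why
- We tried direct DB reads once; schema drift broke three consumers.
  The bus boundary is deliberate, not incidental.
```

Note the "why" section: it encodes the decisions that didn't feel like decisions at the time, which is precisely what the model cannot reconstruct from the code.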


Slop happens when context is skipped.

The failure mode that follows is acceptance without understanding. Ask the model for a solution, get something plausible, move on. Done once, fine. Done repeatedly across a codebase, you get architecture nobody designed. Each change is defensible in isolation. The tenth defensible change in a row produces something nobody can explain, because nobody ever decided it should work that way. With AI, the volume is higher and the outputs look clean. The degradation is architectural, not syntactic.


The ratios shifted. The proof requirement didn't.

Code generation is the 20% that got faster. Testing, review, validation, debugging — the 80% — is still there. Speed up one part of a pipeline and you don't accelerate the whole thing; you find out where the next constraint lives.
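The arithmetic here is just Amdahl's law. A quick sketch, using the essay's illustrative 20/80 split rather than measured data:

```python
# Back-of-the-envelope: why speeding up generation alone caps the
# overall gain (Amdahl's law). The 20/80 split is illustrative.

def overall_speedup(accelerated_fraction: float, factor: float) -> float:
    """Pipeline speedup when one fraction of the work gets `factor`x faster."""
    return 1 / ((1 - accelerated_fraction) + accelerated_fraction / factor)

# Make the 20% (code generation) 10x faster:
print(round(overall_speedup(0.20, 10), 2))   # -> 1.22 (about 1.22x end to end)

# Even infinitely fast generation can't beat 1.25x:
print(round(overall_speedup(0.20, 1e9), 2))  # -> 1.25
```

Validation, review, and debugging are the 80% in the denominator, which is why the next constraint shows up there.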

Faster output raises the stakes for proof. The teams that made the gains real didn't treat generation speed and validation rigor as a tradeoff. Both moved together, or the gains didn't stick.


The model doesn't think. You do.

Fluent output looks like thinking. It reads like a decision was made. But the model generated the most probable continuation of your prompt; the judgment that should have come first got skipped.

Plenty is worth delegating: scaffolding, boilerplate, the mechanical parts of implementation once you know what you want. What's worth keeping is the thinking that precedes all of that. What should this system do? What are the constraints? What does good look like here? Those questions have to be yours, answered before the model gets involved. Otherwise you're not engineering. You're approving a stochastic process.

The developers getting the most from these tools haven't figured out how to offload the most. They've figured out which parts of their work require them specifically, and they protect those parts.


Judgment is the durable moat.

Syntax fluency is being commoditized. What replaces it as the differentiator is harder to learn and harder to fake: knowing what to build, knowing when something's wrong, knowing when to stop. Taste, in the engineering sense: an internalized model of what good looks like that operates faster than explicit reasoning.

That's the product of experience, of building things, breaking them, and paying attention to what failed and why. In a world where everyone has the same generative tools, the differentiator isn't what you can produce. It's what you can see.


Context takes time to build.

Which is why the pipeline problem matters more than it's getting credit for. If AI compresses entry-level work, the engineers who were supposed to accumulate context over their first years of productive struggle may not get those years. Learning to code involves a kind of failure that's hard to shortcut: hitting a problem you don't understand, forming wrong hypotheses, testing them, building the mental models that let you recognize when an answer is wrong.

Stop producing engineers who go through that process and you eventually stop having them. AI can accelerate an experienced engineer's output. It can't manufacture the years of context, failure, and judgment that make someone experienced.


The model's context window is a technical constraint that improves with every release. Yours doesn't improve on its own. The model doesn't have a .plan. It doesn't know what you're building toward, what you're optimizing for, or what you learned the last time something like this broke. That's yours. It's what makes the tool useful, and it's what the tool can't give you.


Part 14 of 14 — What I Think About AI Engineering
