Spec-Driven Development for Brownfield: Verify Code Against Intent With a Verifiable Code Context Layer

Specs and code drift apart the day after you write the PRD, because they are two separate documents that nobody keeps in sync. A verifiable context layer derives the spec a codebase actually implements and then continuously checks the code against it. Here is how spec-driven development works on a brownfield codebase, and why it needs a verifiable code IR that checks code against intent.

Every team has lived this. Someone writes a PRD. Engineers build from it. The day after the build starts, the two begin to drift, because the spec is a document in one place and the code is the truth in another, and nothing keeps them honest with each other. Six months later the spec describes a product that no longer exists and the code is the only real record of intent. Except the code does not actually state the intent; it just enacts it. The why lives in someone’s head, or in a Slack thread, or nowhere.

Most spec-driven development tools answer this by starting from a blank page. You write the spec, they generate the code. That is fine for a greenfield project that does not exist yet. It does nothing for the millions of lines you already shipped. The interesting problem is not “spec first, then code.” It is brownfield: you already have the code, you never wrote down what it was meant to do, and you need a way to state the intent and keep the code honest against it from here on.

That is what a verifiable context layer makes possible. Not a magical two-way machine that rebuilds your code from a document, but a derived, continuously checked contract. You read the code once, derive a candidate spec from what it actually does, and from then on every change gets verified against that spec before it lands. The mechanism behind it is worth understanding, because it is the same property that makes everything else about a verifiable context layer hold up.

Why spec and code drift in the first place

The drift is not a discipline problem you can fix with better process. It is structural. A spec is intent expressed in prose. Code is intent expressed in execution. The translation from the first to the second is lossy and one way. When you build code from a spec, the prose intent gets compiled, by humans, into logic, and the original intent is not stored anywhere in the result. It was used and discarded, like scaffolding.

So the code cannot tell you the spec it implements, because it never kept the spec. And the spec cannot track the code, because it has no link to it. The two drift because there was never a connection between them, only a one-time human translation that left no trace. Documentation rots for the exact same reason a plain code graph cannot tell you whether a change still matches intent. The information that would tie them together was thrown away at the moment of building.

What spec-driven development on brownfield actually requires

To stop the drift you have to be able to do something that sounds almost impossible. You have to derive the intent from the code. Read a service and produce a candidate specification of what it actually implements, not a description of its syntax but a statement of what it is for and why it is built the way it is. Then you have to keep checking the running code against that statement so the two stay tied together instead of wandering off.

A parser cannot do this, because intent was never in the syntax. A code graph cannot do this on its own, because it only kept the symbols. The only thing that can read code and surface the purpose underneath it is a model, and even a model doing it live, in a chat window, on raw files, does it unreliably and forgets it the moment the session ends. To make a derived spec durable and trustworthy, you have to capture it once, into a representation built to hold it, and keep it linked to the code it came from so you can verify against it on every change.

That is exactly what a verifiable code IR is. A model reads the codebase once and lowers it into a representation that captures, for every unit, what it does and why it exists and what business purpose it serves. The intent that was discarded when the code was built is read back out and stored as a contract, linked to the precise code that enacts it. The spec is no longer a separate document. It is a view of the IR, and the IR is the thing the code is continuously checked against.
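To make that concrete, here is a minimal sketch of what one record in such an IR might look like, and how a spec can be rendered as a view over it. The `IRUnit` fields, the `render_spec` function, and the example data are all hypothetical names chosen for illustration, not ByteBell's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRUnit:
    """One unit of a hypothetical code IR: derived intent linked to its source."""
    unit_id: str            # stable name for the unit, e.g. "billing.refund"
    path: str               # file that enacts the intent
    start_line: int         # first line of the enacting code
    end_line: int           # last line of the enacting code
    what: str               # what the unit does
    why: str                # why it exists
    business_purpose: str   # the business purpose it serves


def render_spec(units: list[IRUnit]) -> str:
    """Render the spec as a view over the IR, one cited entry per unit."""
    blocks = []
    for u in sorted(units, key=lambda u: u.unit_id):
        citation = f"{u.path}:{u.start_line}-{u.end_line}"
        blocks.append(
            f"{u.unit_id}\n"
            f"  does: {u.what}\n"
            f"  why: {u.why}\n"
            f"  purpose: {u.business_purpose}\n"
            f"  source: {citation}"
        )
    return "\n\n".join(blocks)


# Illustrative data only; a real record would be derived by a model from the code.
example = IRUnit(
    unit_id="billing.refund",
    path="billing/refunds.py",
    start_line=40,
    end_line=72,
    what="Refunds a captured payment, capped at the captured amount.",
    why="Partial refunds must never exceed what was charged.",
    business_purpose="Customer refunds",
)
print(render_spec([example]))
```

The point of the shape is that every statement of intent carries a pointer back to the lines that enact it, so the spec is never a free-floating document.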

How the spec stays in sync, both directions

Once intent is captured in a verifiable representation, spec and code stop being two artifacts you maintain by hand. They become one contract, read two ways.

Going one way, code to spec, the IR already holds the derived intent, so you can read off the specification a service implements at any time, current and true, because it was derived from the code as it actually is rather than written once and abandoned. The spec stops drifting because it is generated from the code, not maintained alongside it.

Going the other way, intent to code, you do not push a button and watch the system rebuild your service from a paragraph. You edit the intent in the representation, change what a thing is supposed to do, and that becomes the new contract. Every agent edit and human change that follows is then verified against it. A representation that still knows all the cross-repo connections and the surrounding logic can catch the change that would break a consumer three repositories away, before it ships, because it checks the proposed code against the stated intent and the dependency graph rather than trusting it blind.
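A rough sketch of that check, under heavy simplifying assumptions: intent is reduced here to the set of fields a unit promises to return, and the dependency graph to a flat list of consumers. `Contract`, `Consumer`, and `verify_change` are hypothetical names for illustration; a real verifier would reason over much richer intent than field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """Stated intent for one unit, simplified to the fields it promises."""
    unit_id: str
    provides: frozenset[str]


@dataclass(frozen=True)
class Consumer:
    """A downstream dependency, possibly in another repository."""
    repo: str
    unit_id: str               # the unit this consumer depends on
    requires: frozenset[str]   # the fields it actually reads


def verify_change(contract: Contract, proposed_fields: set[str],
                  consumers: list[Consumer]) -> list[str]:
    """Return violations of the contract and of any consumer; empty means pass."""
    violations = []
    # Check the proposed code against the stated intent.
    for name in sorted(contract.provides - proposed_fields):
        violations.append(f"{contract.unit_id}: contract promises '{name}' but the change drops it")
    # Check the proposed code against the dependency graph.
    for c in consumers:
        if c.unit_id != contract.unit_id:
            continue
        for name in sorted(c.requires - proposed_fields):
            violations.append(f"{c.repo}: consumer of {c.unit_id} still reads '{name}'")
    return violations


contract = Contract("orders.get_order", frozenset({"id", "total", "currency"}))
consumers = [
    Consumer("invoicing", "orders.get_order", frozenset({"id", "currency"})),
    Consumer("analytics", "orders.get_order", frozenset({"id", "total"})),
]
# A proposed change that silently drops "currency" from the response.
for violation in verify_change(contract, {"id", "total"}, consumers):
    print(violation)
```

The change is flagged twice, once against the contract and once against the consumer in another repository that still depends on the dropped field.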

That is the honest version of “both ways.” It is not a lossless round-trip where a document becomes a binary and back. It is one derived contract that you can read as a spec, edit as intent, and check every change against. Be clear about the trap here, because every “round-trip code” tool fell into it: a layer reverse-engineered from code cannot also BE the source of truth. The IR is a contract the code is continuously verified against. Weaker promise, far stronger result.

Why this needs a verifiable layer specifically

It is worth being clear about why a normal context graph cannot do this, because it is tempting to think any rich-enough index would. The code-to-spec direction alone needs the representation to hold intent, which a symbol graph does not. But keeping the two in sync needs more than holding intent once. It needs the representation to verify new code against that intent on every change, and that is verifiability, the property a one-way extraction lacks. A context graph that captured intent but could never check a change against it would let you read the spec and never tell you when the code stopped matching it. You would have a snapshot of intent and still be back to manual drift the next day. Spec-driven development on brownfield needs both halves: intent derived and held, and the ability to verify each change against it. Only a verifiable code IR has both.

This is also where the field splits. Spec tools like Tessl, Spec Kit, and Kiro start from a blank page, which means they help the greenfield project and skip the codebase you actually have. Assistants like Augment keep the layer to themselves, locked inside one product. ByteBell derives a verifiable layer from the code you already have, and hands it to every tool. Brownfield first, infrastructure rather than an assistant, and your code never leaves your perimeter.

How ByteBell does it

ByteBell is the verifiable context layer for code, and spec-driven development on brownfield falls straight out of how it is built. We run the LLM compiler pattern, lowering every file once into a verifiable code IR that captures intent, business context, and cross-repository relationships. Index once, query forever, at roughly $7 to $13 per 1,000 files in the DeepSeek style. It runs on your own infrastructure through Docker, which matters more here than almost anywhere, because the representation now holds the derived intent of your business logic, and that is exactly the thing you cannot let leave your perimeter. Every engineer and every agent queries it over one MCP URL, on any copilot, and gets back exactly the relevant intent and code instead of re-reading thousands of files.

Because the IR holds intent and links it to the source, ByteBell can produce the specification a service actually implements, with citations to the exact files and lines that enact it, and then verify an intended change against that contract and the connections it would otherwise break. Per-file SHA-256 diffing means only what changed gets rechecked, so verification stays cheap on every commit. On 46 repositories and 150,000 files, about 8 GB, it delivered roughly 15 to 20% higher accuracy at about 70% lower cost and 70% faster, on roughly a fifth of the tokens, and it was the only approach that finished the cross-repository tasks. That is the same capability that lets a spec edit be checked safely across repos instead of breaking the consumer you forgot about. The research backs the shape of this: RepoGraph (ICLR 2025) lifted SWE-bench resolution to 32.8% by giving agents repository structure, and CodexGraph (NAACL 2025) showed graph-grounded retrieval beating flat search.
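The per-file hashing idea can be sketched in a few lines. This is a minimal illustration using Python's standard `hashlib`, not ByteBell's implementation: `files_to_recheck` and the in-memory dictionaries are hypothetical stand-ins for a stored index and a repository checkout.

```python
import hashlib


def sha256_of(content: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(content).hexdigest()


def files_to_recheck(previous_index: dict[str, str],
                     current_files: dict[str, bytes]) -> tuple[set[str], dict[str, str]]:
    """Compare stored digests with current contents.

    Returns the paths that are new, modified, or deleted since the last run,
    plus the fresh index to store for the next run.
    """
    new_index = {path: sha256_of(data) for path, data in current_files.items()}
    changed = {path for path, digest in new_index.items()
               if previous_index.get(path) != digest}
    deleted = set(previous_index) - set(new_index)
    return changed | deleted, new_index


# Illustrative data: a.py is untouched, b.py is edited, c.py is new, d.py is gone.
previous = {
    "a.py": sha256_of(b"print('a')\n"),
    "b.py": sha256_of(b"print('b')\n"),
    "d.py": sha256_of(b"print('d')\n"),
}
current = {
    "a.py": b"print('a')\n",
    "b.py": b"print('b, edited')\n",
    "c.py": b"print('c')\n",
}
stale, index = files_to_recheck(previous, current)
print(sorted(stale))
```

Only the paths in `stale` need to be re-lowered and re-verified; everything whose digest is unchanged can be skipped.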

Spec and code drift because intent gets discarded at build time and never written down. A verifiable context layer derives it once, keeps it linked to the code, and checks every change against it. Especially as machines write code faster than any team can read it, the only thing worth trusting is a clear statement of what the system must do, with continuous proof the code does that and nothing more. That is what it means for spec and code to finally stay the same thing, read in two directions and verified on every change.

This is ByteBell, the verifiable context layer for code: derived from the codebase you already have, checked on every change, served to every tool over one MCP URL, on your own infrastructure.

www.bytebell.ai
