AST vs Vector vs Graph vs IR: Four Ways to Give an LLM Your Codebase
An AI coding agent cannot read your whole codebase. It reads what you feed it. So every tool in this space, whether it calls itself a copilot, a context engine, an indexer, or a code search product, is really answering one question. How do you turn a codebase into something an LLM can be handed?
There are only four real answers. You can parse the code into a syntax tree. You can embed it into vectors. You can build a graph out of it. Or you can compile it into a verifiable intermediate representation. Almost every tool you have heard of is one of these four, or a blend of the first three.
This is the post you read when you are evaluating tools and the marketing pages all start to sound the same. We are going to go through each representation, say plainly what it keeps and what it throws away, and then explain why the first three turn out to be features of the fourth rather than competitors to it.
The test we are going to use
To compare these fairly we need a single yardstick, so here it is. A good code representation has to do three things. It has to let the model find the right code when a developer asks a question in plain language. It has to preserve the relationships between pieces of code, because software is mostly relationships. And it has to carry intent, meaning the why behind the code, not just the what. There is a fourth thing that almost nobody talks about, which is whether you can check the code back against that intent on every change. Hold onto that one, because it is where the whole comparison turns.
Representation one: the AST
An abstract syntax tree is what you get when you parse code the way a compiler front end does. Tools like tree-sitter walk the source and produce an exact tree of every construct: this function, these parameters, this loop, this call, this import. It is precise, it is fast, and it is language-specific.
What the AST keeps is structure, and it keeps it perfectly. If you want to know what a function calls, what a class contains, or what a file imports, the AST has the exact answer with no guessing.
What the AST throws away is meaning. It can tell you that retryCharge() exists and that it calls stripe.charges.create(), but it has no idea that retryCharge() is the answer to the question “how do we handle failed payments.” It sees the shape of the code and nothing underneath it. Ask the AST a question in plain English and it has nothing to say, because plain English is not in the tree. The AST is the foundation almost everything else is built on, but on its own it is a structural skeleton with no semantics.
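To make that concrete, here is a minimal sketch using Python's standard `ast` module in place of tree-sitter (a swap made only so the example runs with no dependencies). The `retry_charge` function is a Python rendering of the retryCharge() example above, and the helper names `dotted_name` and `calls_by_function` are illustrative assumptions, not part of any tool.

```python
import ast

# Hypothetical source file; it is only parsed, never executed,
# so the stripe package does not need to be installed.
SOURCE = '''
import stripe

def retry_charge(customer_id, amount):
    for attempt in range(3):
        result = stripe.charges.create(customer=customer_id, amount=amount)
        if result:
            return result
    return None
'''

def dotted_name(node):
    """Rebuild a dotted name such as 'stripe.charges.create' from an AST node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None  # e.g. a call on the result of another call

def calls_by_function(source):
    """Map each function name to the sorted call targets found in its body."""
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            targets = set()
            for inner in ast.walk(node):
                if isinstance(inner, ast.Call):
                    name = dotted_name(inner.func)
                    if name:
                        targets.add(name)
            result[node.name] = sorted(targets)
    return result

print(calls_by_function(SOURCE))
```

The tree answers "what does retry_charge call" exactly. Nothing in it mentions failed payments, so a plain-language question has nothing to match against.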
Representation two: vector embeddings
Embeddings take a different bet. Instead of parsing structure, you chop the code into chunks and turn each chunk into a vector, a long list of numbers that places the chunk somewhere in a high-dimensional space. Similar text lands in similar places. To answer a question you embed the question too and grab whatever sits nearby. This is what claude-context and Cody do, and it is what most people mean by semantic code search.
What embeddings keep is a fuzzy notion of topic. Ask for “user login flow” and you will get code that talks about users and logins, which is genuinely useful for a quick lookup and far better than grep when you do not know the exact names.
What embeddings throw away is both structure and meaning, and this is the part people underestimate. Two functions both called validate() from completely unrelated parts of the system embed almost on top of each other, even though they have nothing to do with one another. A utility and the service that depends on it can land far apart even though they are tightly coupled, because coupling is not word overlap. Cosine similarity is measuring how much two pieces of text resemble each other, and resemblance is simply not the same thing as connection. Code is logic, and two completely different looking pieces of code can do the exact same thing while two nearly identical ones do opposite things. Embeddings cannot tell the difference. They give you files that look related. They do not give you understanding.
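The resemblance-versus-connection gap can be seen even with a toy model. The sketch below uses bag-of-words counts as a stand-in for a learned embedding, which is a deliberate simplification: real embedding models are far richer, but cosine similarity over them is still a measure of resemblance. All four snippets and the function names are hypothetical.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Toy stand-in for a learned embedding: counts of lowercase word tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Two unrelated validate() functions that happen to look alike
# (and return opposite answers).
auth_validate = "def validate(data):\n    if not data:\n        raise ValueError('invalid data')\n    return True"
billing_validate = "def validate(data):\n    if not data:\n        raise ValueError('invalid data')\n    return False"

# A utility and the service that depends on it: tightly coupled, little shared text.
backoff_util = "def backoff(attempt):\n    return min(60, 2 ** attempt)"
payment_service = "def charge_with_retry(order):\n    for n in range(5):\n        sleep(backoff(n))\n        submit(order)"

lookalike = cosine(toy_embed(auth_validate), toy_embed(billing_validate))
coupled = cosine(toy_embed(backoff_util), toy_embed(payment_service))
print(f"lookalike pair: {lookalike:.2f}  coupled pair: {coupled:.2f}")
```

The lookalike pair scores far higher than the coupled pair, even though only the second pair has a real dependency between its two pieces.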
Representation three: the code graph
The graph is the grown-up version of the AST. You take all that exact structure the syntax tree gives you and you store it as nodes and edges in a graph database. Functions, classes, files, and the relationships between them. This function calls that one. This class inherits from that one. Then instead of searching by similarity you traverse. Follow the call chain downstream, follow the callers upstream, walk the type edges to the real implementation. This is code graph RAG, and it is what GitNexus, Sourcegraph with SCIP, code-graph-mcp, and CodeGraphContext are doing.
What the graph keeps is the thing vectors threw away, which is structure and connection. It nails multi-hop reasoning, the controller to service to repository chains that embeddings cannot follow because no single similarity hop links the ends. The research is clear that this works. RepoGraph reported 32.8% on SWE-bench, CodexGraph beat similarity-only retrieval, and a January 2026 paper found AST-derived graphs score highest on architectural queries at a fraction of the cost. If you are choosing between vectors and a graph, choose the graph.
What the graph throws away is the same two things the AST did, intent and verifiability, just at a larger scale. First, it still does not carry intent. The edge says processWebhook connects to updateUser, but the reason, which might be a compliance rule that the audit log has to be written first, was never in the syntax, so it is not in the graph. Second, the graph has no way to keep itself honest. It is something you extracted from the code, a shadow of it, and nothing checks the code back against it when either one drifts. Great as an index. Useless when you need to catch a change that quietly broke the contract.
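Traversal is simple to sketch. The example below stores a hypothetical call graph as a plain dictionary rather than in a graph database, and walks it breadth-first in both directions. The node names are invented for illustration, and real tools would populate the edges from parsed code.

```python
from collections import deque

# Hypothetical call edges: caller -> list of callees.
CALLS = {
    "OrderController.create": ["OrderService.place"],
    "OrderService.place": ["OrderRepository.save", "PaymentClient.charge"],
    "OrderRepository.save": [],
    "PaymentClient.charge": [],
    "AdminController.refund": ["PaymentClient.charge"],
}

def reachable(graph, start):
    """Breadth-first traversal: every node reachable from start, in visit order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

def invert(graph):
    """Flip every edge so that traversal follows callers instead of callees."""
    callers = {node: [] for node in graph}
    for caller, callees in graph.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)
    return callers

# Downstream: everything the controller ends up touching, across multiple hops.
print(reachable(CALLS, "OrderController.create"))
# Upstream: everything that can end up charging a card.
print(reachable(invert(CALLS), "PaymentClient.charge"))
```

The controller-to-repository chain comes out of the downstream walk even though the two ends share no vocabulary. Note what the dictionary has no slot for: any reason why an edge exists.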
Representation four: the verifiable IR
Now the fourth option, and the reason the first three are features rather than rivals.
Compilers never settled for a syntax tree or a lossy summary. They lower a program into an intermediate representation, a structured form that keeps everything that matters about what the program means, that you can analyze and transform, and that you can check every later change against. Every serious code transformation in the world passes through one. A verifiable IR for code context is that same idea pointed at AI. Not a thing you passively extract from your code, but a derived contract your code is continuously checked against.
What the IR keeps is everything the other three keep, plus the two things they all dropped. It contains the structure the AST captured. It contains the connections the graph captured. You can still do similarity lookups over it the way embeddings let you. But on top of that, every unit carries meaning, recording what it does, why it exists, and what business purpose it serves, which is the intent that was never in the syntax. And it is verifiable. Code compiles into the IR, and from then on every change gets checked back against it. Code, then a verifiable IR, then every change checked against it. Because the intent is written down in one place, you can derive a candidate spec a module implements, flag a patch that drifts from what the module is meant to do, and catch a hallucinated dependency before it lands instead of after.
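As a rough illustration of "intent written down, then checked", here is a minimal sketch. The `IRUnit` schema and the `check_change` function are hypothetical and are not ByteBell's actual format or any tool's API. They encode the processWebhook example from earlier, with made-up call names `writeAuditLog` and `sendToLegacyCrm`, and assume the calls a change makes have already been extracted as an ordered list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IRUnit:
    """Hypothetical IR record for one unit of code: structure plus stated intent."""
    name: str
    purpose: str                # the "why", which syntax alone never carries
    allowed_calls: frozenset    # dependencies the unit is meant to have
    must_precede: tuple = ()    # (first, then) pairs: first must run before then

def check_change(unit, observed_calls):
    """Return violations for a proposed change; an empty list means it conforms."""
    violations = []
    for call in observed_calls:
        if call not in unit.allowed_calls:
            violations.append(f"undeclared dependency: {call}")
    for first, then in unit.must_precede:
        if then in observed_calls:
            out_of_order = (
                first not in observed_calls
                or observed_calls.index(first) > observed_calls.index(then)
            )
            if out_of_order:
                violations.append(f"ordering broken: {first} must run before {then}")
    return violations

unit = IRUnit(
    name="processWebhook",
    purpose="Apply webhook events to user records; compliance requires the audit log entry first.",
    allowed_calls=frozenset({"writeAuditLog", "updateUser"}),
    must_precede=(("writeAuditLog", "updateUser"),),
)

print(check_change(unit, ["writeAuditLog", "updateUser"]))                     # conforming change
print(check_change(unit, ["updateUser", "writeAuditLog"]))                     # order quietly flipped
print(check_change(unit, ["writeAuditLog", "updateUser", "sendToLegacyCrm"]))  # new dependency
```

The second and third changes would pass a purely structural index without comment, since the edges still exist. They are flagged here only because the ordering rule and the allowed dependencies were recorded alongside the structure.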
Be honest about what that does and does not buy you. A layer derived from code cannot also BE the source of truth, and that is the trap every tool that promised a perfect round trip back to code fell into. The IR does not prove your code is correct. It is a contract the code is continuously checked against, which is a weaker promise and a far stronger result. As machines write code faster than any team can read it, the only thing worth trusting is a clear statement of what the system must do, with continuous proof it does that and nothing more.
That is why these are not four competitors. The AST is the parse step the IR is built on. The graph is the structural view the IR exposes. Vector similarity is one way to search the IR. The IR is the category. The other three are things it does.
The honest table
| | Find by plain question | Keep structure | Carry intent | Verify against intent |
|---|---|---|---|---|
| AST | no | yes | no | no |
| Vectors | roughly | no | no | no |
| Graph | yes, by traversal | yes | no | no |
| Verifiable IR | yes | yes | yes | yes |
Read down the last column. Only one representation lets you check every change back against what the code is supposed to do, and that single property is what separates an index you query from a substrate you can trust.
What this means when you are evaluating tools
When you read a product page, figure out which of the four it actually is, because that tells you its ceiling before you ever run a trial. If it embeds and reranks, it is vectors, and it will plateau on anything that needs real structure. If it parses and traverses, it is a graph, and it will plateau the moment you need the why behind the code or you want to catch a change that quietly violated it. Both are real improvements over grep and both are worth using. Neither lets the model reason over meaning and verify a change against it.
It is worth being clear about who else is in the room. Spec tools like Tessl, Spec Kit, and Kiro start from a blank page, which is great for new work and silent on the system you already run. Assistants like Augment build a layer like this and keep it to themselves. ByteBell derives a verifiable layer from the code you already have, brownfield first, and hands it to every tool you use rather than locking it inside one.
This is what ByteBell is. We are the verifiable context layer for code. We run the LLM compiler pattern, compiling every file once into a verifiable code IR that carries purpose, business context, and cross-repository relationships, with the structural graph stored right beside it, all on your own infrastructure through Docker at a few dollars per thousand files. Your code never leaves your environment. Every engineer queries the same representation through one MCP URL, on any copilot, and every agent edit gets checked against that IR before it lands so drift and hallucinated dependencies get caught instead of shipped. On 46 repositories and 150,000 files it delivered about 10% higher accuracy at 70% lower cost, on roughly a fifth of the tokens, and it was the only approach that finished the cross-repository tasks where graph and vector tools stalled out.
AST, vectors, and graph are three ways to describe your codebase to a model. A verifiable IR is the fourth, and the only one that also lets the model check every change back against what the code is meant to do.