Knowledge Base Chunking Is Product Design
If your agent retrieves the wrong context, the model may still write a beautiful answer. That is the problem.
Retrieval-augmented generation often gets discussed as infrastructure: embeddings, indexes, vector stores, rerankers, and latency. Those pieces matter. But many retrieval failures start earlier, at the moment a messy document is chopped into chunks that no longer resemble how a human would use the source.
A policy is not just words. It has sections, exceptions, dates, owners, and dependencies. A troubleshooting guide has steps. A contract has clauses. A sales playbook has examples and caveats. If chunking destroys that structure, the agent may retrieve a fragment that looks relevant but lacks the condition that makes it true.
[Figure: three example chunks, Billing policy (Chunk 1), Refund exceptions (Chunk 2), and Account tiers (Chunk 3), each source tagged and updated Apr 2026.]
Chunk by meaning before size
There is no universal chunk size that fixes retrieval. A short FAQ answer may be a complete chunk. A refund policy may need the rule, exception, and approval threshold together. A procedure may need one chunk per step, with the prerequisite attached. The right boundary is the smallest unit that can answer a user intent without losing the source logic.
Start with semantic boundaries: headings, clauses, checklist items, procedure steps, table rows, and examples. Then check whether each chunk can stand on its own. If a chunk begins with "this exception" and the exception is no longer attached to the rule, the chunk is either too small or missing the metadata that ties it back to that rule.
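One way to make that check concrete is a small script that splits on headings and flags chunks that open with a dangling reference. This is a minimal sketch, not a recommended pipeline: the "## " heading convention, the list of dangling openers, and the sample policy text are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass

# Assumption: a line starting with "## " opens a new section in the source.
HEADING = re.compile(r"^##\s+(.*)$")
# Assumption: these openers hint that the chunk points at text it no longer contains.
DANGLING = re.compile(r"^(this|that|these|those|such)\b", re.IGNORECASE)


@dataclass
class Chunk:
    heading: str
    text: str


def chunk_by_heading(document: str) -> list[Chunk]:
    """Split a document into one chunk per heading, keeping the heading attached."""
    sections = []
    current = ["(untitled)", []]
    for line in document.splitlines():
        match = HEADING.match(line)
        if match:
            # A new heading closes the section collected so far.
            sections.append(current)
            current = [match.group(1).strip(), []]
        elif line.strip():
            current[1].append(line.strip())
    sections.append(current)
    # Sections without body text are dropped.
    return [Chunk(heading, " ".join(lines)) for heading, lines in sections if lines]


def needs_context(chunk: Chunk) -> bool:
    """Flag chunks whose first word refers back to something outside the chunk."""
    return bool(DANGLING.match(chunk.text))


# Made-up sample text, for illustration only.
SAMPLE = """## Refund rule
Refunds are available within 30 days of purchase.
## Refund exceptions
This exception applies only to annual plans.
"""

chunks = chunk_by_heading(SAMPLE)
for chunk in chunks:
    print(chunk.heading, "-> needs context:", needs_context(chunk))
```

A flagged chunk is a prompt for a human decision: merge it with the rule it depends on, or attach metadata that names the parent section.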
Overlap is a tool, not a ritual
Overlap helps when meaning crosses a boundary. It hurts when it floods retrieval with near-duplicates. Too much overlap can fill the top results with near-identical chunks, so they look consistent while crowding out the one chunk that contains the actual answer. Use overlap where continuity matters: long procedures, multi-part clauses, or definitions that are referenced across nearby sections.
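The trade-off is easy to see on a toy example. The sketch below windows a list of units (sentences or procedure steps) with a configurable overlap, then measures how much two neighboring chunks share. The function names and the step labels are hypothetical, and Jaccard similarity is only one of several reasonable duplicate checks.

```python
def window(units: list[str], size: int, overlap: int = 0) -> list[list[str]]:
    """Group units into chunks of `size`, repeating `overlap` units between neighbors."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(units), step):
        chunks.append(units[start:start + size])
        if start + size >= len(units):
            break  # the last unit is already covered
    return chunks


def shared_fraction(a: list[str], b: list[str]) -> float:
    """Jaccard similarity between two chunks, treated as sets of units."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical procedure steps.
steps = ["step 1", "step 2", "step 3", "step 4", "step 5"]

light = window(steps, size=3, overlap=1)
heavy = window(steps, size=3, overlap=2)

print(len(light), shared_fraction(light[0], light[1]))  # 2 chunks, 0.2 shared
print(len(heavy), shared_fraction(heavy[0], heavy[1]))  # 3 chunks, 0.5 shared
```

With an overlap of two, every adjacent pair shares half of its content, which is the kind of near-duplication that can crowd a top-k result list. Setting overlap per content type, rather than globally, keeps it confined to the places where continuity matters.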
Metadata makes retrieval controllable
The most useful metadata is boring: source name, owner, version, published date, product area, customer segment, region, and content type. Those fields let the workflow filter before retrieval or rerank after retrieval. They also help a reviewer understand why an answer was grounded in a given source.
- Source metadata tells the agent where the answer came from.
- Version metadata prevents old policies from winning over current ones.
- Domain metadata keeps legal, billing, support, and product knowledge from blending together.
- Owner metadata gives teams a path to fix bad source material.
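The metadata roles listed above can be sketched as a filter that runs before any similarity search. This is an illustrative sketch under assumptions: the field names, chunk ids, and helper functions are hypothetical, and a real vector store would normally express the same constraints through its own filter syntax.

```python
def filter_by_metadata(chunks: list[dict], **required) -> list[dict]:
    """Keep chunks whose metadata matches every required field exactly."""
    return [
        chunk for chunk in chunks
        if all(chunk.get(field) == value for field, value in required.items())
    ]


def current_only(chunks: list[dict]) -> list[dict]:
    """Keep only chunks that carry the highest version seen for their source."""
    newest: dict[str, int] = {}
    for chunk in chunks:
        source = chunk["source"]
        newest[source] = max(newest.get(source, chunk["version"]), chunk["version"])
    return [chunk for chunk in chunks if chunk["version"] == newest[chunk["source"]]]


# Made-up records; every field name and value here is an assumption.
CHUNKS = [
    {"id": "refund-rule-v1", "source": "Billing policy", "version": 1,
     "domain": "billing", "owner": "billing-ops"},
    {"id": "refund-rule-v2", "source": "Billing policy", "version": 2,
     "domain": "billing", "owner": "billing-ops"},
    {"id": "nda-clause-v1", "source": "Contract template", "version": 1,
     "domain": "legal", "owner": "legal"},
]

# Domain filter first, then drop superseded versions.
candidates = current_only(filter_by_metadata(CHUNKS, domain="billing"))
print([chunk["id"] for chunk in candidates])  # ['refund-rule-v2']
```

Filtering first means the old policy never competes with the current one on similarity, and the owner field on each surviving chunk tells a reviewer who to contact when the source itself is wrong.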
Evaluate retrieval separately from generation
If an answer is bad, teams often blame the model. First ask whether the right chunks were retrieved. A retrieval eval should test whether the top results include the source a human would use. Only after retrieval is healthy should generation quality become the main question.
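A retrieval eval of this kind can be very small. The sketch below computes a hit rate at k: the share of test queries where at least one chunk a human would use appears in the top k results. The queries, chunk ids, and the stub retriever are hypothetical stand-ins; in practice `retrieve` would call the real retrieval pipeline.

```python
from typing import Callable


def hit_rate_at_k(
    cases: list[dict],
    retrieve: Callable[[str], list[str]],
    k: int = 3,
) -> float:
    """Fraction of cases where any expected chunk id appears in the top k results."""
    if not cases:
        return 0.0
    hits = 0
    for case in cases:
        top_k = retrieve(case["query"])[:k]
        if any(chunk_id in case["expected"] for chunk_id in top_k):
            hits += 1
    return hits / len(cases)


# Stub retriever with canned rankings, standing in for a real index.
FAKE_RANKINGS = {
    "Can I get a refund on an annual plan?":
        ["refund-exceptions", "refund-rule", "account-tiers"],
    "Which tier includes priority support?":
        ["refund-rule", "billing-contacts", "invoice-faq"],
}


def fake_retrieve(query: str) -> list[str]:
    return FAKE_RANKINGS.get(query, [])


# Each case pairs a query with the chunk ids a human reviewer would reach for.
CASES = [
    {"query": "Can I get a refund on an annual plan?",
     "expected": {"refund-exceptions"}},
    {"query": "Which tier includes priority support?",
     "expected": {"account-tiers"}},
]

score = hit_rate_at_k(CASES, fake_retrieve, k=3)
print(f"hit rate@3: {score:.2f}")  # hit rate@3: 0.50
```

Because this score never touches the model, a low number points at chunking, metadata, or the index rather than at the prompt.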
In Trumpets, knowledge sources are part of the agent operating system: they connect to workflows, prompts, benchmarks, and review gates. That means chunking is not a back-office ingestion detail. It is product design for how an agent knows what it is allowed to say.