Why high-quality RAG starts long before the first question

RAG Process Image

If you’re building, evaluating, or buying a Retrieval-Augmented Generation (RAG) system and finding that answer quality is inconsistent, incomplete, or hard to trust, the problem is rarely the model.

Most conversations about RAG focus on what happens after a user asks a question.

  • Which model should we use?

  • How do we rank results?

  • How do we improve answer quality?

Those are important questions, but they are not where high-quality RAG systems are truly made.

In our experience, the real determinants of RAG quality are set long before the first question is ever asked. They live upstream, in decisions that are often invisible once a system is running: how documents are prepared, structured, segmented, and loaded into the knowledge base in the first place.

This is where many RAG systems quietly succeed or quietly fall short.

The illusion of “Automatic RAG”

Modern AI platforms make it easy to “just upload documents” and start asking questions.

From a usability perspective, this is genuinely powerful. From a quality perspective, it can be misleading.

When ingestion is treated as a black box, critical trade-offs are hidden:

  • How much context does each chunk really contain?

  • Are sections being split in ways that preserve legal, procedural, or narrative meaning?

  • Are large documents dominating retrieval results at the expense of smaller but equally important ones?

  • Are tables, schedules, and structured content preserved or fragmented?

These questions don’t show up in a demo. They tend to surface months later, when users start losing confidence in the answers.
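
To make the last of those questions concrete, the toy snippet below shows how a naive fixed-size splitter can cut a table in half, leaving no single chunk that contains the full schedule. It is an illustration only, with made-up content and an arbitrary chunk size, not a depiction of any particular platform's ingestion logic.

  # Toy illustration: naive fixed-size chunking cuts a table mid-row,
  # so no single chunk contains the complete schedule.
  document = (
      "Schedule 2 - Delivery penalties\n"
      "| Days late | Penalty |\n"
      "| 1-7       | 2%      |\n"
      "| 8-14      | 5%      |\n"
      "| 15+       | 10%     |\n"
  )

  chunk_size = 80  # characters, chosen arbitrarily for the example
  chunks = [document[i:i + chunk_size] for i in range(0, len(document), chunk_size)]

  for n, chunk in enumerate(chunks, 1):
      print(f"--- chunk {n} ---\n{chunk}")

Neither chunk, retrieved on its own, can answer a question about the full penalty schedule.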

Why ingestion is not a simple preprocessing step

At SnapInsight, we treat ingestion as a first-order design problem, not a setup task.

Chunking, in particular, is often misunderstood, even by experienced teams. It's tempting to treat it as choosing a single number: 1,000 tokens, 2,000 tokens, or similar. In reality, chunking is a multidimensional optimisation problem that sits at the intersection of:

  • Document structure (headings, clauses, schedules, appendices)

  • Semantic coherence (what information belongs together)

  • Retrieval competition (which documents crowd out others)

  • Token economy (how much context is consumed per answer)

  • Fairness and coverage across a corpus

A chunk that is “too small” may be precise but lack context.

A chunk that is “too large” may be comprehensive but dilute relevance and dominate retrieval.

There is no universal “right” answer, only context-sensitive trade-offs that need to be understood and managed.
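
To make the token-economy side of the trade-off concrete, here is a back-of-the-envelope calculation. The budget and chunk sizes are hypothetical numbers chosen for illustration, not recommendations.

  # Hypothetical figures: a shared context budget that retrieved chunks must fit into.
  context_budget = 8_000  # tokens available for retrieved context per answer

  for chunk_size in (250, 1_000, 4_000):
      max_passages = context_budget // chunk_size
      print(f"{chunk_size}-token chunks: at most {max_passages} distinct passages per answer")

Smaller chunks buy breadth across the corpus; larger chunks buy depth within a single document. Neither comes for free.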

Why we simulate instead of guessing

Rather than relying on rules of thumb, SnapInsight uses simulation to explore these trade-offs before ingestion.

At a high level (without exposing proprietary detail), this involves:

  • Analysing the internal structure of each document: headings, depth, tables, numeric density, and layout signals

  • Generating multiple plausible chunking strategies per document

  • Simulating retrieval behaviour under realistic constraints (e.g. top-k scarcity, token budgets, document competition)

  • Measuring outcomes such as:

    • Coverage (which documents are actually retrievable)

    • Dominance (whether a few documents crowd out others)

    • Token efficiency

    • Structural integrity (whether tables, clauses, and schedules are preserved)

Crucially, this simulation happens before any language model is involved.

The goal is not to optimise answers; it’s to ensure the conditions for good answers exist in the first place.
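
To give a feel for what this kind of pre-ingestion simulation looks like, here is a deliberately minimal sketch. The toy corpus, the word-overlap scorer, and the coverage and dominance metrics are stand-ins chosen for illustration; they are not SnapInsight's actual pipeline.

  # A minimal sketch of chunking simulation before ingestion:
  # chunk each document, run sample queries, and measure which
  # documents ever surface in the top-k and which ones dominate.
  from collections import Counter

  def chunk_words(text, chunk_size):
      """Split a document into fixed-size word chunks (a deliberately naive strategy)."""
      words = text.split()
      return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]

  def overlap_score(query, chunk):
      """Toy relevance signal: count of shared words between query and chunk."""
      q, c = set(query.lower().split()), set(chunk.lower().split())
      return len(q & c)

  def simulate(corpus, queries, chunk_size, top_k=3):
      """Chunk every document, run each query, and report coverage and dominance."""
      chunks = []  # (doc_id, chunk_text)
      for doc_id, text in corpus.items():
          chunks.extend((doc_id, c) for c in chunk_words(text, chunk_size))

      retrieved_docs = Counter()
      for query in queries:
          ranked = sorted(chunks, key=lambda dc: overlap_score(query, dc[1]), reverse=True)
          for doc_id, _ in ranked[:top_k]:
              retrieved_docs[doc_id] += 1

      coverage = len(retrieved_docs) / len(corpus)                              # share of docs ever retrieved
      dominance = max(retrieved_docs.values()) / sum(retrieved_docs.values())   # top doc's share of top-k slots
      return {"chunk_size": chunk_size, "coverage": coverage, "dominance": dominance}

  corpus = {
      "policy": "Refunds are issued within 30 days of purchase subject to inspection ...",
      "contract": "The supplier shall deliver goods within 14 days and schedule 2 lists penalties ...",
      "handbook": "Employees accrue leave monthly and schedules are published each quarter ...",
  }
  queries = ["refund within 30 days", "delivery schedule penalties"]

  for size in (8, 16, 32):  # chunk sizes in words, a crude proxy for tokens
      print(simulate(corpus, queries, chunk_size=size))

Even in a toy setting like this, changing the chunk size shifts which documents are reachable at all and how evenly the top-k slots are shared, without any language model in the loop.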

Why this matters for organisations that care about quality

For organisations operating in regulated, technical, or high-stakes environments, mediocre RAG isn’t just inconvenient - it’s risky.

Poor ingestion decisions can lead to:

  • Confident but incomplete answers

  • Missing edge cases buried in large documents

  • Inconsistent responses depending on how a question is phrased

  • Erosion of trust in the system over time

These are not model problems. They are engineering and design problems.

Looking beyond ingestion

Ingestion is only one example of where upstream discipline matters.

The same philosophy applies across the RAG lifecycle:

  • Retrieval logic

  • Ranking and filtering

  • Answer synthesis

  • Evaluation and monitoring

As models become more capable, the differentiator is no longer raw intelligence - it’s how carefully the system around the model is designed.

Our roadmap continues to extend simulation into areas like:

  • Faithfulness (does the answer reflect source intent?)

  • Relevance under ambiguity

  • Sensitivity to structural and semantic edge cases

This is slow, deliberate work - and that’s intentional.

Why we take the harder path

There will always be tools that promise faster, cheaper, more automatic RAG.

For some use cases, that may be enough.

But for organisations that care deeply about answer quality, reliability, and long-term trust, the details matter, even when they’re invisible.

At SnapInsight, we choose to stay close to those details.

We invest in understanding the mechanics, simulating the outcomes, and tuning systems with intention.

Not because it’s easy, but because it’s how dependable, trustworthy AI systems are built.