[Progress News] [Progress OpenEdge ABL] How Retrieval Strategies Enable AI Experiences

  • Thread starter: Adam Bertram
Status
Not open for further replies.
What turns a fluent wrong answer into a useful one? Is it the model, or the evidence path behind it? And when two teams share the same model but get radically different outcomes, what changed first: the prompt, or the retrieval decision that determined what the model was allowed to see?

If a support bot can ace an error-code question but miss the policy exception one document away, what failed first: the model, or the slice of evidence it was handed? And if that slice was wrong, why would a better model rescue the answer?

The Ceiling the Model Cannot Raise on Its Own


Ask a large language model (LLM) about last week’s incident, a new policy or a customer relationship management (CRM) record created after its knowledge cutoff, and it can only improvise unless retrieval brings evidence into the prompt. That was the basic move in the paper “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” which introduced retrieval-augmented generation (RAG): ground generation in current external evidence.

The harder lesson comes later. A better model does not fix missing or stale evidence. It also does not fix the wrong context. If the retriever hands the model the wrong brief, you get a cleaner version of the wrong answer.

That is the ceiling. AI experience quality is strongly constrained by retrieval quality before model choice can help. If the answer is brittle, ask what the retrieval layer is allowed to surface.

Matching Retrieval Strategy to Query Type


The cleanest way to think about retrieval is not “what is the best search?” It is “what kind of question is this user actually asking?” Exact identifiers call for precise lookup; paraphrased intent needs similarity search. Comparative and follow-up questions need surrounding evidence so the experience does not snap back into a search box.

Keyword search, typically ranked with Okapi BM25, is still the right move when the user names an error code, stock keeping unit (SKU) or clause number. Semantic search is stronger when the user remembers the idea but not the wording. Hybrid search with Reciprocal Rank Fusion (RRF) sits between them, fusing keyword and semantic rankings when enterprise users send both types of query through the same interface.
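
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion in Python. It is not how any particular product implements hybrid search. The document IDs and the two input rankings are made up for illustration, and k=60 is the constant suggested in the original RRF paper.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a keyword index and a semantic index.
keyword_hits = ["kb-18456", "kb-policy", "kb-reset"]
semantic_hits = ["kb-policy", "kb-lockout", "kb-18456"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is why RRF suits an interface that receives both exact and paraphrased queries. It needs only rank positions, so the keyword and semantic scores never have to be put on a common scale.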

Consider the difference. For a support bot, “error 18456 after password rotation” should route to exact matching. Legal and platform workflows are different: an amended clause needs semantic context around the language, while a ticket compared with a CRM note needs enough surrounding evidence to avoid a lucky top result.
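
One way to picture that routing decision is a small classifier that runs before retrieval. The sketch below is a hedged illustration only: the regular expressions, the cue words and the strategy names are all assumptions, and a production router would likely use richer signals than pattern matching.

```python
import re

# Hypothetical patterns for queries that name an exact identifier.
IDENTIFIER_PATTERNS = [
    re.compile(r"\berror\s+\d{3,}\b", re.IGNORECASE),        # e.g. "error 18456"
    re.compile(r"\bSKU[-\s]?\d+\b", re.IGNORECASE),          # e.g. "SKU-20417"
    re.compile(r"\bclause\s+\d+(?:\.\d+)*", re.IGNORECASE),  # e.g. "clause 7.2"
]

# Hypothetical cue words for comparative questions.
COMPARATIVE_CUES = re.compile(
    r"\b(?:compare|compared|versus|vs|difference between)\b", re.IGNORECASE
)

def choose_strategy(query: str) -> str:
    """Pick a retrieval strategy name from the shape of the query."""
    if COMPARATIVE_CUES.search(query):
        # Comparisons need surrounding evidence, not a single top hit.
        return "expanded_context"
    if any(pattern.search(query) for pattern in IDENTIFIER_PATTERNS):
        return "keyword"
    # No identifier and no comparison: treat it as paraphrased intent.
    return "semantic"

print(choose_strategy("error 18456 after password rotation"))
print(choose_strategy("Compare ticket 4412 with the CRM note for Acme"))
print(choose_strategy("what did we agree about early termination"))
```

The comparative check runs first on purpose: a query can name an identifier and still ask for a comparison, and in that case the surrounding evidence matters more than the exact match.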

The experience changes with that choice. Exact retrieval feels like search. Broader context feels like conversation. Hybrid retrieval sits in the middle, where most enterprise assistants actually live.

Multi-Hop Retrieval for Fragmented Enterprise Knowledge


A CRM note says who owns the account. An open ticket says what broke. A one-pass search can surface pieces of that story, but not the story itself.

That is where multi-hop retrieval earns its keep. The system retrieves one piece of evidence and uses it to form the next query until the answer is assembled from connected facts. Recent work such as HopRAG: Multi-Hop Reasoning for Logic-Aware Retrieval-Augmented Generation shows why this loop helps when relationships matter more than lexical similarity.
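
The loop itself is simple to sketch. The toy example below is not HopRAG’s algorithm; it swaps in plain term overlap as the retriever and uses an invented four-document corpus, purely to show how each retrieved piece of evidence becomes the next query.

```python
import re

# Invented toy corpus; IDs and contents are illustrative only.
CORPUS = {
    "kb-18456": "Error 18456: login failed. Reported by account ACME-42.",
    "crm-acme": "Account ACME-42 is owned by Dana Ruiz. Open ticket: TCK-77.",
    "ticket-77": "TCK-77: service login broke after password rotation.",
    "kb-vpn": "VPN setup guide for remote staff.",
}

def tokenize(text):
    """Lowercase the text and split it into a set of simple terms."""
    return set(re.findall(r"[a-z0-9-]+", text.lower()))

def retrieve(query, seen):
    """Return the unseen document ID with the largest term overlap, or None."""
    query_terms = tokenize(query)
    best_id, best_overlap = None, 0
    for doc_id, text in CORPUS.items():
        if doc_id in seen:
            continue
        overlap = len(query_terms & tokenize(text))
        if overlap > best_overlap:
            best_id, best_overlap = doc_id, overlap
    return best_id

def multi_hop(question, max_hops=3):
    """Retrieve one document per hop, using each hit as the next query."""
    evidence, query = [], question
    for _ in range(max_hops):
        doc_id = retrieve(query, set(evidence))
        if doc_id is None:
            break  # nothing connected is left to follow
        evidence.append(doc_id)
        query = CORPUS[doc_id]  # the evidence just found forms the next query
    return evidence

question = "Who owns the account behind error 18456?"
print(retrieve(question, set()))  # a single pass stops at the error article
print(multi_hop(question))        # the loop follows the account into the ticket
```

A single pass finds only the error article, because the question shares almost no wording with the CRM note or the ticket. The hops reach them through identifiers the earlier documents mention, which is the sense in which relationships matter more than lexical similarity to the original question.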

For enterprise content, retrieval also has to preserve relationships at ingest. Graph Extraction enriches the knowledge layer by extracting entities and relationships before the user asks the question. That gives later hops a better map of which people, accounts, tickets, products and documents belong together.

That difference matters when the answer depends on more than one artifact:

  • A support experience can start with the error code, then follow the account relationship into the open ticket.
  • A legal assistant can retrieve the clause and the surrounding section, with the document title preserved in context.
  • An internal assistant can pass a whole short resource when the full document matters.
  • A sales copilot can combine account notes and product documentation without pretending those systems are one source.

The pattern is not “retrieve more.” It is “retrieve what the experience needs”: a title, a neighboring paragraph, a whole resource or a graph hop, each with a token bill attached.

Configuring and Measuring the Retrieval Layer


Once you pick the strategy, your next job is operational: decide what search mode fires, what evidence gets in and which signal you will watch when answers start drifting.

Progress Agentic RAG exposes that distinction through application programming interface (API) surfaces built for different jobs. Use /find when the caller needs merged evidence. Use /ask when the caller needs a generated answer grounded in that evidence. Same knowledge layer, different experience.

Scope control is just as important. filter_expression can narrow retrieval by resource, labels, dates, file type, origin path and other metadata before answer generation. That is how one experience stays inside approved policy documents while another searches broader content.
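
The idea behind scope control can be shown without any product API. The sketch below is not the filter_expression syntax, which this article does not spell out; it is a generic in-memory illustration of narrowing candidates by metadata before anything reaches answer generation. The Chunk fields and the in_scope helper are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Chunk:
    """A retrievable piece of content with hypothetical metadata fields."""
    text: str
    labels: set = field(default_factory=set)
    file_type: str = ""
    modified: date = date.min

def in_scope(chunk, required_labels=frozenset(), file_types=None, modified_after=None):
    """Return True only if the chunk passes every metadata constraint given."""
    if not required_labels <= chunk.labels:
        return False
    if file_types is not None and chunk.file_type not in file_types:
        return False
    if modified_after is not None and chunk.modified <= modified_after:
        return False
    return True

chunks = [
    Chunk("Refund exceptions need owner approval.", {"policy", "approved"}, "pdf", date(2024, 6, 1)),
    Chunk("Draft refund ideas from the offsite.", {"policy"}, "docx", date(2024, 7, 1)),
    Chunk("Old refund policy.", {"policy", "approved"}, "pdf", date(2021, 3, 1)),
]

# A policy assistant stays inside approved, recent PDFs.
policy_scope = [
    c.text for c in chunks
    if in_scope(c, {"policy", "approved"}, {"pdf"}, date(2024, 1, 1))
]
# A broader assistant searches anything labeled as policy.
broad_scope = [c.text for c in chunks if in_scope(c, {"policy"})]
print(policy_scope)
print(broad_scope)
```

The same content pool serves both experiences; only the filter differs. That is also why security or compliance can review the filter separately from the prompt.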

Search strategy settings handle the context tradeoff. Textual hierarchy and neighboring-paragraph strategies add titles or adjacent paragraphs when a retrieved chunk is too thin. Full-resource context can pass an entire matched resource, but Progress’s token consumption guidance is clear about the tradeoff: larger context increases input size while hard token limits can reduce answer quality.
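
The tradeoff is easy to see with a toy document. The sketch below is an assumption-laden illustration, not the product’s implementation: expand_context and rough_tokens are hypothetical helpers, the sample paragraphs are invented, and the whitespace word count is only a crude stand-in for a real tokenizer.

```python
def expand_context(paragraphs, hit_index, title=None, neighbors=0):
    """Return the matched paragraph plus an optional title and adjacent paragraphs."""
    start = max(0, hit_index - neighbors)
    end = min(len(paragraphs), hit_index + neighbors + 1)
    parts = ([title] if title else []) + paragraphs[start:end]
    return "\n".join(parts)

def rough_tokens(text):
    """Crude size estimate: whitespace-separated words, not real model tokens."""
    return len(text.split())

doc = [
    "Refunds are issued within 30 days of purchase.",
    "Enterprise contracts follow the terms in their order form.",
    "Exceptions require written approval from the account owner.",
    "Approved exceptions are logged in the CRM record.",
]

thin = expand_context(doc, hit_index=2)
wide = expand_context(doc, hit_index=2, title="Refund Policy", neighbors=1)
full = expand_context(doc, hit_index=2, title="Refund Policy", neighbors=len(doc))

for name, context in [("thin", thin), ("wide", wide), ("full", full)]:
    print(name, rough_tokens(context))
```

Each step up adds evidence the model may need and input it has to pay for. Printing the size next to the strategy keeps that cost visible when you choose between a neighboring paragraph and the whole resource.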

Measurement closes the loop. REMi, Agentic RAG’s evaluation layer, separates the questions teams usually argue about in one messy meeting.

Each diagnostic has a different job. Use Context Relevance as the retrieval signal, Groundedness as the generation check, Answer Relevance as the question-fit check and token consumption as the cost guardrail. If the pattern is drift, read those signals before blaming the model.

That makes the next move concrete: fix context before prompt work, inspect generation when support is weak, check routing for comparative or multi-hop requests and tune context size when token costs climb.
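
That reading order can be written down as a small triage function. This is a sketch under stated assumptions: scores are taken to be normalized to the 0 to 1 range, the 0.7 threshold and the 8000-token budget are placeholders, and mapping a low Answer Relevance score to a routing check is one interpretation of the sequence above rather than documented REMi behavior.

```python
def triage(context_relevance, groundedness, answer_relevance, input_tokens,
           min_score=0.7, token_budget=8000):
    """Return the first action suggested by the diagnostics, in reading order."""
    if context_relevance < min_score:
        return "fix_context"          # retrieval surfaced the wrong evidence
    if groundedness < min_score:
        return "inspect_generation"   # evidence was fine, answer is not supported by it
    if answer_relevance < min_score:
        return "check_routing"        # supported answer, but to a different question
    if input_tokens > token_budget:
        return "tune_context_size"    # quality holds, cost does not
    return "ok"

print(triage(0.4, 0.9, 0.9, 2000))
print(triage(0.9, 0.5, 0.9, 2000))
print(triage(0.9, 0.9, 0.9, 12000))
```

The order of the checks carries the argument: retrieval is examined before generation, so a weak context score never gets misread as a model problem.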

What to Audit Next


Start with one experience. Pick a support bot or internal assistant and trace four things: query type, retrieval strategy, context scope and the metric you will watch when quality drifts.

That audit gives you a cleaner decision than “the model is bad.” It tells you whether to change exact matching, graph-enriched hops, source filters, context size or measurement. The model still matters. It is finishing the job the retriever started.

FAQ

Who Should Own Retrieval Quality in an Enterprise AI Program?


You want shared ownership, but not shared ambiguity. Put one named lead on the retrieval layer, usually from the platform or data/AI team. App teams own the experience built on top of it, and security or compliance should approve the filters and access rules. If nobody owns the retrieval path, the model becomes the scapegoat for a routing problem.

How Do You Tell Whether the Failure Is Retrieval or Generation?


Look at the evidence first. If the retrieved context is off, weak or missing, the problem started in retrieval. If the context is solid but the answer still drifts, generation is the problem. REMi-style scoring makes that distinction visible.

What Should Stay Portable If You Change Models Later?


Your retrieval strategy, source filters and evaluation logic should stay portable. The model can change; the retrieval decision should not have to be rebuilt every time. Keep those layers separate, and you can tune the experience without redoing the stack.
