How Retrieval Algorithms Shape Better LLM Responses?

This content originally appeared on DEV Community and was authored by Dev J. Shah 🥑

Introduction

In the era of LLMs, specifically in Retrieval-Augmented Generation (RAG), retrieval algorithms play one of the most important roles. The better the retrieval results, the better the context provided to the LLM, and the better the responses it generates.

The method of retrieving information is also the backbone of search engines. However, this blog focuses on retrieval specifically for providing context to LLMs.

The way it works is by ranking documents based on their relevance to the given query. Retrieval algorithms can be classified based on how the relevance score is computed. A relevance score is a numerical measure that indicates how well a piece of information matches a given query.

The two common retrieval methods are: Term-based Retrieval and Embedding-based Retrieval.

Term-based Retrieval

As the name suggests, Term-based Retrieval uses the keywords from the query to find the most relevant documents. However, this approach can have issues.

Many documents may contain the same keyword, but not all of them can fit in the LLM’s context window. As a result, the document with the most useful context might be left out. A simple heuristic is to prioritize the document that contains the keyword the highest number of times. The number of times a term appears in a document is called Term Frequency (TF).

A query may contain multiple keywords, out of which some are more important than others. The importance of each keyword is inversely proportional to the number of documents in which it appears. The more documents a keyword appears in, the less important it becomes. This metric is called Inverse Document Frequency (IDF).

Mathematically, IDF = log((Total number of documents) ÷ (Number of documents containing the keyword)). The logarithm dampens the raw ratio, so very rare terms do not dominate the score. A higher IDF value indicates greater importance of the keyword.

The well-known scoring scheme that combines these two metrics is TF-IDF: each query term contributes TF × IDF to a document’s relevance score.
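To make this concrete, here is a minimal TF-IDF sketch over a toy corpus, using only the standard library. The corpus, query terms, and function name are my own illustration, not from any particular library:

```python
import math
from collections import Counter

def tf_idf_scores(query_terms, documents):
    """Score each document against the query terms.

    TF  = raw count of the term in the document.
    IDF = log(N / df), where N is the total number of documents
          and df is the number of documents containing the term.
    """
    n_docs = len(documents)
    tokenized = [doc.lower().split() for doc in documents]
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            df = sum(1 for toks in tokenized if term in toks)
            if df == 0:
                continue  # term appears nowhere; contributes nothing
            score += tf[term] * math.log(n_docs / df)
        scores.append(score)
    return scores

docs = [
    "cats chase mice",
    "dogs chase cats",
    "mice eat cheese",
]
print(tf_idf_scores(["cats", "mice"], docs))
```

The first document matches both query terms, so it receives the highest score; the other two match only one term each.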

Embedding-based Retrieval

Term-based Retrieval is focused on keywords rather than meaning, which can result in irrelevant documents being retrieved. On the other hand, Embedding-based Retrieval ranks documents based on how closely they align with the query in terms of semantic meaning.

With Embedding-based Retrieval, indexing involves an additional step: converting documents into embeddings. Embeddings are high-dimensional vectors that preserve important properties of the original data. These embeddings are then stored in a specialized database called a Vector Database.

To learn more about embeddings, I recommend checking out my other blog, which explains how text is converted into embeddings and how retrieval is performed using cosine similarity, one of the most common embedding-based retrieval techniques.
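As a small illustration of that idea, here is cosine similarity over toy three-dimensional vectors standing in for embeddings (real embedding models produce hundreds or thousands of dimensions; the vectors and names below are made up for the example):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query_vec = [0.9, 0.1, 0.0]          # toy "embedding" of the query
doc_vecs = {
    "doc_a": [0.8, 0.2, 0.1],        # points roughly the same direction
    "doc_b": [0.0, 0.1, 0.9],        # points a very different direction
}

# Rank documents by similarity to the query, highest first.
ranked = sorted(doc_vecs, key=lambda d: cosine_similarity(query_vec, doc_vecs[d]),
                reverse=True)
print(ranked)  # doc_a ranks first
```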

Comparing Term-based and Embedding-based Retrieval

Term-based Retrieval is generally faster than Embedding-based Retrieval during both storing (indexing) and fetching (querying). However, Embedding-based Retrieval generally yields more relevant results, and its quality can be improved further over time, for example by fine-tuning the embedding model.

Two metrics often used in RAG to evaluate the quality of a retriever are:

Context Precision = (Relevant retrieved documents) ÷ (All retrieved documents)

Context Recall = (Relevant retrieved documents) ÷ (All relevant documents)
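These two formulas are simple to compute once you know which documents count as relevant; the document IDs below are invented for the example:

```python
def context_precision(retrieved, relevant):
    """Fraction of retrieved documents that are actually relevant."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(retrieved)

def context_recall(retrieved, relevant):
    """Fraction of all relevant documents that were retrieved."""
    hits = len(set(retrieved) & set(relevant))
    return hits / len(relevant)

retrieved = ["d1", "d2", "d3", "d4"]   # what the retriever returned
relevant = ["d1", "d3", "d5"]          # ground-truth relevant documents

print(context_precision(retrieved, relevant))  # 2 of 4 retrieved are relevant: 0.5
print(context_recall(retrieved, relevant))     # 2 of 3 relevant were found: ~0.67
```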

Another consideration is cost. Generating embeddings requires compute resources and often involves API costs. In addition, depending on the vector database, both vector storage and vector search queries can also be expensive.

Combining Retrieval Methods

Combining both retrieval algorithms is called Hybrid Search.

There are two common approaches:

  1. Sequential Combination:
    • First, use Term-based Retrieval to fetch all documents containing the keyword.
    • Then, use Embedding-based Retrieval to re-rank those documents based on semantic meaning.
  2. Parallel Combination:
    • Both retrieval methods run in parallel.
    • Each produces a ranking of documents by relevance.
    • The results are then merged or compared to generate a final ranking.

Hybrid Search allows leveraging the strengths of both approaches: the speed of keyword search and the semantic depth of embeddings.
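For the parallel approach, one common way to merge the two rankings is Reciprocal Rank Fusion (RRF), though the text above does not prescribe a specific merge method. A sketch, with made-up document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists: each document scores the sum of 1 / (k + rank)
    across all lists it appears in. k=60 is a conventional default that
    softens the influence of top positions."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

term_ranking = ["d2", "d1", "d4"]       # from keyword search
embedding_ranking = ["d1", "d3", "d2"]  # from vector search

print(reciprocal_rank_fusion([term_ranking, embedding_ranking]))
```

Documents ranked highly by both methods (here, d1 and d2) rise to the top of the fused list, which is exactly the behavior Hybrid Search aims for.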

Citation

This blog is inspired by the “Retrieval Algorithms” topic in the book “AI Engineering” by Chip Huyen. This is a brief introduction to the topic; to learn more in detail, I recommend referring to the book.

