
Retrieval-Augmented Generation (RAG)

References

  • https://en.wikipedia.org/wiki/Retrieval-augmented_generation
    • "Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information.[1] With RAG, LLMs do not respond to user queries until they refer to a specified set of documents. These documents supplement information from the LLM's pre-existing training data.[2] This allows LLMs to use domain-specific and/or updated information that is not available in the training data.[2] For example, this helps LLM-based chatbots access internal company data or generate responses based on authoritative sources."

    • "RAG improves large language models (LLMs) by incorporating information retrieval before generating responses."

    • "Unlike LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources."

    • "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts."

    • LLMs with RAG are programmed to prioritize new information. This technique has been called "prompt stuffing." Without prompt stuffing, the model's input consists only of the user's query; with prompt stuffing, relevant retrieved context is prepended to that query to guide the model's response. This approach places key information early in the prompt, encouraging the model to prioritize the supplied data over its pre-existing training knowledge.

    • Challenges:

      • "RAG does not prevent hallucinations in LLMs."
      • "While RAG improves the accuracy of large language models (LLMs), it does not eliminate all challenges."
      • "LLMs may struggle to recognize when they lack sufficient information to provide a reliable response."
      • "Without specific training, models may generate answers even when they should indicate uncertainty."
      • a "model [may lack] the ability to assess its own knowledge limitations."
      • "RAG systems may retrieve factually correct but misleading sources, leading to errors in interpretation."
      • "an LLM may extract statements from a source without considering its context, resulting in an incorrect conclusion."
      • "when faced with conflicting information RAG models may struggle to determine which source is accurate."
      • "The worst case outcome of this limitation is that the model may combine details from multiple sources producing responses that merge outdated and updated information in a misleading manner."
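The "prompt stuffing" idea described above can be sketched in a few lines: retrieve the documents most relevant to the query, then prepend them to the prompt so the model is encouraged to ground its answer in the supplied text. The toy document store, the keyword-overlap scorer (a stand-in for a real vector search), and the prompt template below are all illustrative assumptions, not any specific library's API.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query.

    A real RAG system would use embeddings and a vector index here;
    keyword overlap is just a self-contained stand-in.
    """
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """'Stuff' the retrieved context ahead of the user's question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


docs = [
    "The 2024 employee handbook allows remote work on 3 days per week.",
    "Quarterly revenue figures are published on the internal finance portal.",
    "The cafeteria is open from 8am to 3pm on weekdays.",
]

prompt = build_prompt("How many remote work days are allowed per week?", docs)
print(prompt)
```

The resulting prompt would then be sent to the LLM in place of the bare query. Note that this sketch also illustrates the challenges listed above: if the store contained two handbooks with conflicting limits, both could be stuffed into the context, and nothing in the pipeline tells the model which one is current.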

Suggested Background Reading

RAG Development & Optimization (10 Part Series)

  1. RAG Performance Optimization Engineering Practice: Implementation Guide Based on LangChain
  2. Optimizing RAG Indexing Strategy: Multi-Vector Indexing and Parent Document Retrieval
  3. RAG Retrieval Performance Enhancement Practices: Detailed Explanation of Hybrid Retrieval and Self-Query Techniques
  4. Comprehensive Performance Optimization for RAG Applications: Six Key Stages from Query to Generation
  5. In-Depth Understanding of RAG Query Transformation Optimization: Multi-Query, Problem Decomposition, and Step-Back
  6. RAG Application Optimization Strategies: From Document Processing to Retrieval Techniques
  7. Customizing LangChain Components: Building a Personalized RAG Application
  8. Detailed Explanation of LangChain's Vector Storage and Retrieval Technology
  9. Introduction to RAG Application Development: Comprehensive Analysis of LangChain Document Processing
  10. In-Depth Understanding of LangChain's Document Splitting Technology

Risks & Limitations

Advanced RAG

Simple RAG

Hybrid RAG

Agentic RAG

Graph RAG