
Retrieval-Augmented Generation (RAG)

References

  • https://en.wikipedia.org/wiki/Retrieval-augmented_generation
    • "Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information.[1] With RAG, LLMs do not respond to user queries until they refer to a specified set of documents. These documents supplement information from the LLM's pre-existing training data.[2] This allows LLMs to use domain-specific and/or updated information that is not available in the training data.[2] For example, this helps LLM-based chatbots access internal company data or generate responses based on authoritative sources."

    • "RAG improves large language models (LLMs) by incorporating information retrieval before generating responses."

    • "Unlike LLMs that rely on static training data, RAG pulls relevant text from databases, uploaded documents, or web sources."

    • "RAG is a way of improving LLM performance, in essence by blending the LLM process with a web search or other document look-up process to help LLMs stick to the facts."

    • LLMs with RAG are programmed to prioritize new information. This technique has been called "prompt stuffing." Without prompt stuffing, the model's input consists only of the user's query; with prompt stuffing, relevant retrieved context is prepended to that query to guide the model's response. This approach places key information early in the prompt, encouraging the model to prioritize the supplied data over its pre-existing training knowledge.

    • Challenges:

      • "RAG does not prevent hallucinations in LLMs."
      • "While RAG improves the accuracy of large language models (LLMs), it does not eliminate all challenges."
      • "LLMs may struggle to recognize when they lack sufficient information to provide a reliable response."
      • "Without specific training, models may generate answers even when they should indicate uncertainty."
      • a "model [may lack] the ability to assess its own knowledge limitations."
      • "RAG systems may retrieve factually correct but misleading sources, leading to errors in interpretation."
      • "an LLM may extract statements from a source without considering its context, resulting in an incorrect conclusion."
      • "when faced with conflicting information RAG models may struggle to determine which source is accurate."
      • "The worst case outcome of this limitation is that the model may combine details from multiple sources producing responses that merge outdated and updated information in a misleading manner."
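The "prompt stuffing" idea described above can be sketched in a few lines: retrieve the documents most relevant to the query, then prepend them to the prompt so the model is encouraged to ground its answer in the supplied text. The toy document store, the keyword-overlap scorer (a stand-in for a real vector search), and the prompt template below are all illustrative assumptions, not any specific library's API.

```python
def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query.

    A real RAG system would use embeddings and a vector index here;
    keyword overlap is just a self-contained stand-in.
    """
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(query_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """'Stuff' the retrieved context ahead of the user's question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


docs = [
    "The 2024 employee handbook allows remote work on 3 days per week.",
    "Quarterly revenue figures are published on the internal finance portal.",
    "The cafeteria is open from 8am to 3pm on weekdays.",
]

prompt = build_prompt("How many remote work days are allowed per week?", docs)
print(prompt)
```

The resulting prompt would then be sent to the LLM in place of the bare query. Note that this sketch also illustrates the challenges listed above: if the store contained two handbooks with conflicting limits, both could be stuffed into the context, and nothing in the pipeline tells the model which one is current.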

Suggested Background Reading

RAG Development & Optimization (10 Part Series)

  1. RAG Performance Optimization Engineering Practice: Implementation Guide Based on LangChain
  2. Optimizing RAG Indexing Strategy: Multi-Vector Indexing and Parent Document Retrieval
  3. RAG Retrieval Performance Enhancement Practices: Detailed Explanation of Hybrid Retrieval and Self-Query Techniques
  4. Comprehensive Performance Optimization for RAG Applications: Six Key Stages from Query to Generation
  5. In-Depth Understanding of RAG Query Transformation Optimization: Multi-Query, Problem Decomposition, and Step-Back
  6. RAG Application Optimization Strategies: From Document Processing to Retrieval Techniques
  7. Customizing LangChain Components: Building a Personalized RAG Application
  8. Detailed Explanation of LangChain's Vector Storage and Retrieval Technology
  9. Introduction to RAG Application Development: Comprehensive Analysis of LangChain Document Processing
  10. In-Depth Understanding of LangChain's Document Splitting Technology

Risks & Limitations

Advanced RAG

Simple RAG

Hybrid RAG

Agentic RAG

Graph RAG