This open-source ❤️ repo gives you everything you need to learn and build a Retrieval-Augmented Generation (RAG) application from scratch. It's a complete, hands-on resource that walks through the entire RAG pipeline, covering both the fundamentals and advanced techniques like multi-querying, routing, and custom retrieval workflows.
Each notebook is designed as a practical guide, helping you move step-by-step from understanding the basics to experimenting with real-world implementations.
What it covers:
- Query Construction - Learn how to translate natural language into structured queries across SQL, Cypher, or vector search. (Text-to-SQL, Text-to-Cypher, Self-Query Retriever)
- Query Translation - Improve retrieval quality through decomposition and rephrasing. (Multi-query, RAG-Fusion, Hypothetical Docs)
- Routing - Dynamically select the most relevant database or embedding context for each query.
- Retrieval - Use advanced techniques like Re-Rank, RankGPT, RAG-Fusion, or CRAG to refine results, or even pull live data from external sources.
- Indexing - Explore multi-representation embeddings, hierarchical summarization, and optimization methods. (RAPTOR, ColBERT, fine-tuning)
- Generation - Enhance response quality with iterative reasoning and retrieval loops using Self-RAG and RRR.
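To give a feel for one of the techniques listed above: RAG-Fusion runs several rewrites of the same question through the retriever and merges the resulting ranked lists, commonly with reciprocal rank fusion (RRF). The snippet below is a minimal sketch of that merging step only, not code taken from the notebooks. The function name `reciprocal_rank_fusion`, the toy document IDs, and the smoothing constant `k=60` (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    rankings: list of lists, each ordered best-first (one list per query rewrite).
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Each document earns 1 / (k + rank) from every list it appears in.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from three rewrites of the same user question.
results_per_query = [
    ["a", "b", "c"],
    ["b", "a", "d"],
    ["b", "c", "a"],
]
fused = reciprocal_rank_fusion(results_per_query)
print(fused)
```

Documents that rank well across many rewrites rise to the top, even if no single query placed them first. In a full pipeline, the fused list would then be passed to the generation step as context.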
If you want to understand RAG inside out and build your own system from the ground up, this repo is a solid starting point.