RAG for accurate retrieval?
What are the best practices for preparing your data so that it’s embedded properly in a RAG vector store and the AI agent can actually retrieve it accurately? I see a lot of YouTube videos of people doing this, but they’re obviously simplified for show, and the retrieval quality has to be poor or average at best. What are the real best practices here?

For example, should you build your knowledge source in a Google Sheet that matches the vector table structure in Supabase one-to-one, with metadata for …? Or use Markdown in a doc? Should you predetermine the chunk size and edit the knowledge source to match it, instead of relying on “random” chunking?

How do you set up your raw data to get the most accurate retrieval from the AI agent? (Let’s assume an n8n Q&A AI chat agent with a Supabase vector store and 1,000 rows of unique data from online course material.) Thanks!!
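
To make the "predetermined chunks with metadata" idea concrete, here is a rough sketch of what I have in mind, not a recommendation. It assumes the course material is written in Markdown and that each heading-delimited section should become one or more rows. The names `chunk_markdown` and `split_section`, the `max_chars` limit, and the `content` / `metadata` keys are all hypothetical choices for illustration; they are not part of n8n's or Supabase's API, and the embedding step is left out entirely.

```python
import re
from typing import Dict, List


def split_section(body: str, max_chars: int) -> List[str]:
    """Pack whole paragraphs into pieces of roughly max_chars.

    A single paragraph longer than max_chars is kept whole rather than cut
    mid-sentence (an assumption; you may prefer a hard limit).
    """
    pieces: List[str] = []
    current = ""
    for para in re.split(r"\n\s*\n", body.strip()):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            pieces.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        pieces.append(current)
    return pieces


def chunk_markdown(text: str, source: str, max_chars: int = 1200) -> List[Dict]:
    """Turn a Markdown document into rows, one per heading-aligned chunk."""
    # Group lines under the most recent heading.
    sections = []
    heading, lines = "", []
    for line in text.splitlines():
        match = re.match(r"#{1,6}\s+(.*)", line)
        if match:
            sections.append((heading, lines))
            heading, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    sections.append((heading, lines))

    rows: List[Dict] = []
    for heading, lines in sections:
        for part, piece in enumerate(split_section("\n".join(lines), max_chars)):
            # Prefix the heading so each chunk carries its own context.
            content = f"{heading}\n\n{piece}" if heading else piece
            rows.append({
                "content": content,
                "metadata": {"source": source, "heading": heading, "part": part},
            })
    return rows


if __name__ == "__main__":
    sample = "# Intro\n\nWelcome to the course.\n\n# Lesson 1\n\nFirst paragraph.\n\nSecond paragraph.\n"
    for row in chunk_markdown(sample, source="course.md"):
        print(row["metadata"], repr(row["content"]))
```

The point of the sketch is that chunk boundaries follow the structure I wrote (headings and paragraphs) instead of an arbitrary character count, and every row carries metadata that could be used for filtering. Whether that actually improves retrieval is exactly what I'm asking.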