Memberships

AI Automation Society

248.1k members • Free

4 contributions to AI Automation Society
RAG - how accurate can it get?
Hello, I have built a RAG with the following setup: a Google Sheet with URLs I fill in. n8n fetches the pages, turns them into Markdown, cleans them with a JavaScript snippet, agent 1 cleans the text, agent 2 creates keywords, topics, and metadata, a code node handles the chunking, OpenAI's small embedding model embeds the chunks, and everything is upserted to Supabase. The end result is a database with relevant info, with approximately 100 URLs embedded. The content is information about services from the city and from different volunteer organizations. I use a sandwich and vector search setup with my chatbot. Some questions are answered perfectly, but some are just bad. So my questions are: How accurate should I expect the answers to be? Should I use different Supabase tables for content and contact info? Should I use OpenAI large or Voyage instead of OpenAI small? I run a small non-profit organization and am trying to make something that is helpful for people searching for help, but I am struggling.
0 likes • 60m
@Hicham Char Thank you! Hybrid search, is that using both vector and text?
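(For reference: "hybrid search" usually does mean exactly that — running a vector similarity search and a keyword/full-text search side by side, then merging the two ranked lists. Reciprocal rank fusion is a common way to merge. A minimal sketch, with illustrative document IDs and not tied to Supabase specifically:)

```python
# Hybrid search via reciprocal rank fusion (RRF).
# vector_hits / keyword_hits are document IDs ordered best-first,
# e.g. from a pgvector similarity query and a full-text query.

def rrf_merge(vector_hits, keyword_hits, k=60):
    """Merge two ranked result lists; a higher fused score ranks higher.
    Documents appearing near the top of BOTH lists win."""
    scores = {}
    for hits in (vector_hits, keyword_hits):
        for rank, doc_id in enumerate(hits):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "b" ranks high in both lists, so it comes out on top.
merged = rrf_merge(["a", "b", "c"], ["b", "d"])
```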
My RAG Chatbot Had 400 Documents But Gave Garbage Answers (The Document Quality Fix) 🔥
Built a perfect RAG system for a client knowledge base. Indexed 400 documents. Beautiful vector search. Lightning-fast retrieval. One problem: completely useless answers.

"What's our refund policy?"
Agent: "I found 3 documents mentioning refunds."

That's not an answer. That's a search result. The client needed actual answers from policy documents, not document lists.

THE IRONY: The RAG system used a community template. Embedding, Qdrant vector store, retrieval logic: all brilliant. But it was being fed garbage document text. Scanned PDFs with broken parsing. Tables rendered as random characters. Multi-column layouts read in the wrong direction. The vector store was full of corrupted text. The agent was retrieving nonsense. Confidently wrong.

DISCOVERY MOMENT: Checked what the RAG actually stored. A policy document saying "NET 30 PAYMENT TERMS" got indexed as "N E T 3 0 P A Y M E N T T E R M S" with random line breaks. The agent couldn't match queries because the stored text was destroyed during basic PDF extraction. Perfect RAG. Broken input.

THE FIX: Added document preprocessing before RAG ingestion. Parse documents properly FIRST → clean structured text → THEN feed the vector store. Now tables stay tables, multi-column layouts read correctly, headers are separated from body text, and scans get OCR'd properly.

TRANSFORMATION: Same question: "What's our refund policy?"
Before: "I found 3 documents mentioning refunds."
After: "Full refund within 30 days if unused. After 30 days, store credit only. Shipping not refundable. See Section 4.2 of Customer Policy."
Same RAG template. Just clean document input.

THE NUMBERS:
400 documents reprocessed with proper parsing
Query accuracy: 94% correct answers now
Responses include: specific policy details with section citations
Client feedback: finally usable
Setup time: 45 minutes to add preprocessing
Documents processed: handles PDFs, Word, scanned images
Monthly savings: 8 hours of answering policy questions manually

THE PATTERN: RAG quality depends entirely on the quality of the documents going into the vector store.
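(One concrete cleanup step implied by the "N E T 3 0" example above is collapsing letter-spaced runs before ingestion. A rough heuristic sketch — not the poster's actual workflow, and note that recovering the original word boundaries would additionally need a dictionary-based segmenter:)

```python
import re

def collapse_letter_spacing(text):
    """Collapse runs of single space-separated characters, a common
    artifact of bad PDF extraction. Word boundaries inside the run
    are lost ("NET30"), so this is a lossy heuristic."""
    def join_run(match):
        return match.group(0).replace(" ", "")
    # Three or more single alphanumerics in a row, space-separated.
    return re.sub(r"\b(?:[A-Za-z0-9] ){2,}[A-Za-z0-9]\b", join_run, text)

collapse_letter_spacing("Pay N E T 3 0 now")  # normal words untouched
```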
1 like • Nov '25
Thank you! I have the same problem. How have you set up the cleaning flow?
🚀New Video: Build Your First RAG Pipeline for Better RAG (step-by-step)
If you’re building RAG agents in n8n, this is one of the most important tutorials you’ll ever watch. In this step-by-step video, I’ll show you how to build a RAG (Retrieval-Augmented Generation) pipeline completely with no code. This setup automatically keeps your database synced with your source files, so when you update or delete a file, your database updates too. That means your AI agents always search through accurate, trustworthy data instead of outdated information. Without this system in place, you can’t rely on your AI’s answers at all. By the end of this video, you’ll understand exactly how to connect everything inside n8n, Google Drive, and Supabase, even if you’re a complete beginner.
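(The sync behavior described here usually reduces to: on file update, delete every chunk stored under that file's ID and re-ingest; on file delete, just delete. A sketch with a plain dict standing in for the Supabase table — all names hypothetical:)

```python
# Minimal delete-then-reinsert sync. Keying every chunk row by its
# source file ID lets updates and deletions target exactly the stale
# rows, so the store never serves outdated chunks.

store = {}  # file_id -> list of chunk records

def upsert_file(file_id, text, chunk_size=200):
    store.pop(file_id, None)  # drop stale chunks before re-ingesting
    chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    store[file_id] = [{"file_id": file_id, "text": c} for c in chunks]

def delete_file(file_id):
    store.pop(file_id, None)

upsert_file("doc1", "old " * 100)   # first ingest: 400 chars -> 2 chunks
upsert_file("doc1", "new content")  # update: old chunks fully replaced
delete_file("doc1")                 # delete: nothing left behind
```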
2 likes • Oct '25
Thank you Nate!
2 likes • Oct '25
@Nate Herk This was great, but how can I learn the best way to process data? I need a vector database with many different documents of different sizes. So in order to get accurate query results from a chatbot, I need to chunk the documents and add keywords etc. to the chunks. I have spent many hours trying to build this using GPT as my programmer, but I do not get the results I need. Any tips or videos I can use?
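(A common answer to this question is to split on paragraph boundaries up to a size limit and attach metadata — keywords, source document, chunk index — to every chunk so filters can narrow the vector search. A sketch; the size limit and field names are arbitrary choices, not a recommendation from the video:)

```python
def chunk_with_metadata(doc_id, text, keywords, max_chars=500):
    """Split text on blank lines, packing paragraphs into chunks of at
    most max_chars, and attach shared metadata to every chunk."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paragraphs:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)  # current chunk is full; start a new one
            current = p
        else:
            current = f"{current}\n\n{p}" if current else p
    if current:
        chunks.append(current)
    return [
        {"doc_id": doc_id, "chunk_index": i, "keywords": keywords, "text": c}
        for i, c in enumerate(chunks)
    ]

records = chunk_with_metadata(
    "shelter-info",
    "Opening hours: Mon-Fri 9-16.\n\nAddress: Main Street 1.",
    ["shelter", "hours"],
)
```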
I’m giving you 3 new n8n templates (these are perfect for starting an agency)
I just finished building 3 automation templates that are absolute game-changers if you're trying to start an automation agency. A sales agent, proposal generation agent, and client onboarding agent. These are the exact workflows agencies charge thousands to set up, and I'm giving them to you for free. But here's the catch: I'm not just handing you the templates and walking away. I'm running a free challenge where we'll walk you through building each one step-by-step so you actually understand how they work. By the end, you'll have 3 polished automations you can use to run your own agency or show to potential clients. Click here to get all the details on the challenge See you inside. Cheers, Nate
5 likes • Oct '25
Thank you for the inspiration @Nate Herk! I am overwhelmed by all the possibilities, so one prime challenge for me is to focus on one thing :)
@espen-watne-andresen-3955
I run an NGO and am looking for ways to maximize our impact.

Active 53m ago
Joined Oct 20, 2025