David Buchalter

AI Developer Accelerator

Memberships

AI Automation Society

368.9k members • Free

AI Developer Accelerator

11.3k members • Free

1 contribution to AI Developer Accelerator

Brandon Hancock

18h •

General discussion

The RAG Pipeline That Actually Works on Meeting Transcripts (With Patrick Chouinard)

Hey guys! I just sat down with @Patrick Chouinard for one of the coolest deep dives we've done in the community. Patrick has become our community's go-to AI expert and one of the most helpful resources we have. He's quietly been building something most teams pay a vendor $50k a year for.

He's turning every community call we've ever recorded (two and a half years of two- to three-hour conversations) into a queryable "community brain." You'll soon be able to ask it anything that's been discussed and get a real answer with citations.

Here's the part that broke my brain. Standard RAG completely falls apart on transcripts. The question gets asked at minute 6, the conversation drifts, and the real answer shows up at minute 41. A normal chunker has no idea those two moments belong to the same idea. Patrick solved it by adding an LLM analysis layer BEFORE chunking that restructures the transcript into self-contained units of knowledge.

You can watch the full breakdown above! Here's everything we covered:

✅ Why traditional chunk-and-embed fails on non-linear data
✅ The LLM analysis layer that turns raw transcripts into RAG-ready knowledge
✅ How Patrick picks the cheapest model that's still smart enough for each step (Kimi K2.5 for restructuring, Sonnet 4.6 for the signal extraction)
✅ Why he chose LanceDB over Pinecone (and when you'd flip that decision)
✅ Running the whole thing locally with Ollama, Open WebUI, Gemma 4 4B, and gpt-oss:20B for more complex retrieval
✅ Using Claude Code to build a custom chunker instead of fighting with a library
✅ Real cost math: about 40 cents per two-hour episode and under $100 to process the entire archive

The wildest part is the price. Once the embeddings are built, querying is free forever because everything runs on your own machine. No SaaS, no per-token cost, no IT review.

Patrick is also planning to open-source the full pipeline once a few rough edges are ironed out. So if you've been wanting to build something like this for your own team, agency, or client, you'll have a working blueprint to start from.
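To make the "minute 6 vs. minute 41" problem concrete, here is a minimal Python sketch. It is not Patrick's pipeline, which hasn't been published. It assumes an upstream LLM analysis pass has already tagged each transcript segment with a topic; the `Segment` class, the `topic` field, the sample transcript, and all function names are hypothetical, and the labels are hard-coded to stand in for the LLM step. It contrasts fixed time-window chunking with grouping by topic into self-contained units.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    minute: int
    speaker: str
    text: str
    topic: str  # hypothetical label; assumed to come from an LLM analysis pass


def naive_chunks(segments, window_minutes=10):
    """Fixed time-window chunking: groups segments only by when they were said."""
    buckets = defaultdict(list)
    for seg in segments:
        buckets[seg.minute // window_minutes].append(seg)
    return [buckets[key] for key in sorted(buckets)]


def knowledge_units(segments):
    """Group segments by topic so a question and its late answer stay together."""
    units = defaultdict(list)
    for seg in segments:
        units[seg.topic].append(seg)
    return {topic: sorted(segs, key=lambda s: s.minute) for topic, segs in units.items()}


def render_unit(topic, segs):
    """Flatten one unit into text to embed, keeping minute marks for citations."""
    lines = [f"Topic: {topic}"]
    lines += [f"[{s.minute:02d}m] {s.speaker}: {s.text}" for s in segs]
    return "\n".join(lines)


# Made-up sample transcript, for illustration only.
segments = [
    Segment(6, "A", "How do we handle rate limits?", "rate-limits"),
    Segment(12, "B", "Unrelated: there is a new editor release.", "tooling"),
    Segment(41, "C", "Back to rate limits: use exponential backoff.", "rate-limits"),
]

naive = naive_chunks(segments)      # question and answer land in different chunks
units = knowledge_units(segments)   # question and answer share one unit
print(render_unit("rate-limits", units["rate-limits"]))
```

With time-window chunking, the question and its answer end up in separate chunks, so a retriever can match one without the other. Grouping by topic first gives the embedder a single unit that contains both.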

New comment 9h ago

David Buchalter

1 like • 17h

Amazing! Great to see you’re back, @Brandon Hancock.

Level 1 - Bit Newbie

4 points to level up

David Buchalter

@david-buchalter-4190

Co-founder and CTO of Cure Bytes AI

Active 17h ago

Joined Sep 26, 2025
