
Memberships

Open Source Voice AI Community

873 members • Free

1 contribution to Open Source Voice AI Community
Built a Full AI Pipeline on One Laptop — Voice Is Next
Hey everyone — been building local-first AI infrastructure and this community is exactly my vibe. I run a full AI pipeline from a single laptop (RTX 5080 16GB, 32GB DDR5) — Ollama in Docker with GPU passthrough, PostgreSQL, Redis.

The philosophy: 80% of the AI workload runs on free local models, and only the 20% that needs frontier reasoning hits a cloud API. Cost per pipeline run dropped from $8-15 to $0.15-0.40.

I've shipped a few tools with this setup — market scanners, a knowledge retention engine with RAG, and a live SaaS API product. All from the same machine.

What brought me here: I want to add a voice layer. Seeing folks run Pipecat with local STT/TTS on consumer GPUs is exactly the direction I'm heading. My Ollama stack already handles LLM inference — pairing that with local Whisper or the new NVIDIA Nemotron STT model on the same GPU seems like the natural next step.

A few things from the recent threads caught my eye:

- @Kwindla's sub-500ms voice-to-voice on an RTX 5090 with Nemotron — curious how that scales down to a 5080 with 16GB VRAM when the LLM is also loaded
- @Jin Park's custom orchestration engine replacing Vapi/Retell — that modular approach maps directly to how I route pipeline stages between local and cloud models
- The latency discussion around local vs cloud STT — has anyone benchmarked Whisper locally against Deepgram for voice agent round-trip times?

Looking forward to learning from this group and sharing what I build.
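For anyone curious what the 80/20 routing looks like in practice, here's a minimal sketch. It assumes a simple per-task flag decides escalation; the function name, the `needs_frontier_reasoning` flag, and the cloud URL are illustrative assumptions on my part, not a literal copy of my pipeline — only the local URL (Ollama's default port) is real:

```python
# Hypothetical sketch: send ~80% of tasks to a local Ollama model and
# escalate only tasks flagged as needing frontier reasoning to a cloud API.
# `route_request` and the cloud endpoint are illustrative, not real code.

LOCAL_ENDPOINT = "http://localhost:11434/api/generate"  # Ollama's default port
CLOUD_ENDPOINT = "https://api.example.com/v1/chat"      # placeholder cloud API

def route_request(task: dict) -> str:
    """Return the endpoint a task should hit: local by default,
    cloud only when the task is flagged for frontier reasoning."""
    if task.get("needs_frontier_reasoning"):
        return CLOUD_ENDPOINT
    return LOCAL_ENDPOINT
```

In a real pipeline the flag would come from a per-stage config rather than the task dict, so the local model stays the default and the cloud API is strictly the fallback.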
1 like • 6d
@Eric Klein Appreciate the Dograh callout — hadn't looked at it closely before. The self-hostable angle and BYOM approach are solid, and I like that they're going after the Vapi/Retell lock-in problem head-on.

That said, my work sits in a different part of the local AI landscape. I'm less interested in telephony and conversation routing and more focused on what happens when you treat local models as infrastructure rather than endpoints — meaning the intelligence layer lives on your machine, serves multiple tools, and costs nothing to run at steady state. The interesting problems there aren't "how do I build a voice bot" but "how do I make local models useful enough that the cloud API becomes the fallback, not the default."

Dograh's solving a real problem for people building voice products. What I'm chasing is closer to the question of what a single developer can automate for themselves when compute is free and private.

Different starting points, but I think both camps benefit from the open-source-first ethos winning over the black-box SaaS model. Good thread though — the more people building on local infra the better.
Rayne Robinson
@rayne-robinson-3262
Building zero-cost local AI tools. GPU-accelerated automation from a laptop.

Active 4d ago
Joined Feb 24, 2026
ESTP
Arizona