🎙️ How Can I Create My Own Voice AI Agent (Like VAPI) but for My Language?
Hey everyone 👋
I’m trying to figure out how to build my own voice AI agent, something similar to VAPI or Retell, but customized for my language and its accents.
We have over 40 accents, and I want to fine-tune a model that can actually understand and speak like people from each region — not just transcribe words, but really adapt to local pronunciation, tone, and vocabulary.
I have experience in full-stack development (React, Spring Boot, Go, etc.), but I’ve never fine-tuned a model or built a voice pipeline before.
So my main questions are:
  • 🧠 What’s the best way to start building a custom voice AI like this?
  • 🗣️ How can I collect and fine-tune on many accents effectively?
  • ⚙️ What models, tools, or frameworks do you recommend for speech-to-text, text-to-speech, and conversation handling?
  • 💾 Do I need to build my own dataset, or can I use / adapt open ones?
If anyone here has worked on custom speech AI or VAPI-like systems, I’d really appreciate some pointers 🙏
Thanks in advance!
Khaled Rouissi 🚀
Khalid Rouissi
Open Source Voice AI Community
skool.com/open-source-voice-ai-community-6088