🎙️ How Can I Create My Own Voice AI Agent (Like VAPI) but for My Language?
Hey everyone 👋
I'm trying to figure out how to build my own voice AI agent, something similar to VAPI, Retell, etc., but customized for my language and its accents.
We have over 40 accents, and I want to fine-tune a model that can actually understand and speak like people from each region โ€” not just transcribe words, but really adapt to local pronunciation, tone, and vocabulary.
I have experience in full-stack development (React, Spring Boot, Go, etc.), but I've never done fine-tuning or built voice pipelines before.
So my main questions are:
  • 🧠 What's the best way to start building a custom voice AI like this?
  • 🗣️ How can I collect data for many accents and fine-tune on it effectively?
  • ⚙️ What models, tools, or frameworks do you recommend for speech-to-text, text-to-speech, and conversation handling?
  • 💾 Do I need to build my own dataset, or can I use or adapt open ones?
If anyone here has worked on custom speech AI or VAPI-like systems, I'd really appreciate some pointers 🙏
Thanks in advance!
Khaled Rouissi 🚀
powered by
Open Source Voice AI Community
skool.com/open-source-voice-ai-community-6088
Voice AI made open: Learn to build voice agents with Livekit & Pipecat and uncover what the closed platforms are hiding.