How Can I Create My Own Voice AI Agent (Like VAPI) but for My Language?
Hey everyone,

I'm trying to figure out how to build my own voice AI agent, something similar to VAPI, Retell, etc., but customized for my language and its accents. We have over 40 accents, and I want to fine-tune a model that can actually understand and speak like people from each region: not just transcribe words, but really adapt to local pronunciation, tone, and vocabulary.

I have experience in full-stack development (React, Spring Boot, Go, etc.), but I've never done fine-tuning or built voice pipelines before.

So my main questions are:

- What's the best way to start building a custom voice AI like this?
- How can I collect data for many accents and fine-tune on them effectively?
- What models, tools, or frameworks do you recommend for speech-to-text, text-to-speech, and conversation handling?
- Do I need to build my own dataset, or can I use or adapt open ones?

If anyone here has worked on custom speech AI or VAPI-like systems, I'd really appreciate some pointers.

Thanks in advance!

Khaled Rouissi