New NVIDIA open model for voice agents: Nemotron Speech ASR
NVIDIA released a new open-source speech-to-text model designed from the ground up for low-latency use cases like voice agents. This is part of NVIDIA's new focus on open models, which I'm excited about. The new models in the Nemotron family include STT and TTS models, specialized models such as guardrail models, and LLMs. And they are completely open: open weights, training code, training data sets, and inference tooling.
This new STT model is very fast. Here's a voice agent running locally on my RTX 5090 with sub-500ms voice-to-voice inference.
I'm also on Twitter and LinkedIn, if either of those platforms is your thing. (I post a lot about voice agents on both.)
Kwindla Kramer