Mohammad Mussab

Open Source Voice AI Community

Memberships

10K Club | Sell With Usama

1k members • Free

Open Source Voice AI Community

879 members • Free

36 contributions to Open Source Voice AI Community

Mohammad Mussab

Jan 15 •

Pipecat

GeminiLive S2S + pipecat-flows Integration Issue

Hey everyone! I'm trying to integrate GeminiLive S2S (speech-to-speech) with pipecat-flows for a healthcare booking agent.

The Problem: When pipecat-flows transitions between nodes, it sends LLMSetToolsFrame to update available tools. GeminiLive requires WebSocket reconnection when tools change (API limitation). After reconnection, the conversation state breaks and Gemini doesn't follow the new node's task messages to call functions.

What works:
- OpenAI LLM + Azure STT + ElevenLabs TTS with pipecat-flows ✅
- Tool updates happen seamlessly, no reconnection needed

What doesn't work:
- GeminiLive S2S + pipecat-flows ❌
- Every node transition → reconnection → broken flow

Current workaround attempts:
- Monkey-patched process_frame to handle LLMSetToolsFrame
- Wait for session ready after reconnection
- Trigger inference with new context messages
- Still inconsistent behavior

Questions:
1. Has anyone successfully used GeminiLive with pipecat-flows?
2. Is there a recommended pattern for handling tool updates without reconnection?
3. Should I create a custom adapter that pre-registers all tools at connection time?

Any guidance appreciated! 🙏
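For question 3, the idea of pre-registering every tool at connection time can be sketched in plain Python. This is only an illustration of the pattern, not pipecat-flows or GeminiLive code: `ToolGate`, `enter_node`, and the tool names are hypothetical, and it is an untested assumption that declaring the full tool list once would avoid the reconnection described above. The sketch keeps one registry for the whole flow and enforces the per-node tool set on the application side.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

# Hypothetical handler type: takes the tool arguments, returns any result.
ToolHandler = Callable[[Dict[str, Any]], Any]


@dataclass
class ToolGate:
    """Holds every tool for the whole flow and tracks which ones the current node allows."""

    handlers: Dict[str, ToolHandler] = field(default_factory=dict)
    allowed: Set[str] = field(default_factory=set)

    def register(self, name: str, handler: ToolHandler) -> None:
        self.handlers[name] = handler

    def all_tool_names(self) -> List[str]:
        # The full list that would be declared once, when the session connects.
        return sorted(self.handlers)

    def enter_node(self, tool_names: Set[str]) -> None:
        # Called on a node transition instead of changing the declared tools.
        unknown = set(tool_names) - set(self.handlers)
        if unknown:
            raise KeyError(f"unregistered tools: {sorted(unknown)}")
        self.allowed = set(tool_names)

    def call(self, name: str, args: Dict[str, Any]) -> Dict[str, Any]:
        # Reject tools that exist but are not part of the current node.
        if name not in self.allowed:
            return {"status": "error", "message": f"{name} is not available at this step"}
        return {"status": "ok", "result": self.handlers[name](args)}


gate = ToolGate()
gate.register("check_availability", lambda args: ["09:00", "10:30"])
gate.register("book_appointment", lambda args: f"booked {args['slot']}")

gate.enter_node({"check_availability"})
print(gate.call("book_appointment", {"slot": "09:00"}))  # rejected in this node

gate.enter_node({"book_appointment"})
print(gate.call("book_appointment", {"slot": "09:00"}))  # allowed after the transition
```

The trade-off is that the model sees every tool for the whole session, so each node's task message would have to say which tools apply, and the error result gives the model a chance to recover when it picks the wrong one.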

New comment Jan 15

Mohammad Mussab

0 likes • Jan 15

@John George @Nour aka Sanava @Kwindla Kramer @everyone

Mohammad Mussab

Nov '25 •

General discussion

Best Observability Tools for Voice AI Frameworks?

What observability tools are others using with Pipecat or similar voice AI frameworks? I've built a production voice agent using Pipecat and currently track basic metrics (call duration, sentiment, summary, transcripts) in a custom dashboard. It goes into production tomorrow, and the problem I expect to face is that when errors occur, debugging will be painful. My current logging approach creates massive log files that are nearly impossible to analyze efficiently when tracking down issues.
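One common way to make large logs easier to search is to emit one JSON object per line and tag every event with a call identifier. The sketch below uses only the Python standard library; the logger name, the `call_id` and `stage` fields, and the event names are assumptions made for illustration and are not part of Pipecat.

```python
import io
import json
import logging
from typing import List


class JsonLineFormatter(logging.Formatter):
    """Formats each record as a single JSON line with per-call fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            # Fields passed through `extra=` become attributes on the record.
            "call_id": getattr(record, "call_id", None),
            "stage": getattr(record, "stage", None),
        }
        return json.dumps(payload)


def make_logger(stream) -> logging.Logger:
    logger = logging.getLogger("voice_agent_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger.addHandler(handler)
    return logger


def events_for_call(log_text: str, call_id: str) -> List[dict]:
    """Returns only the events that belong to one call, in logged order."""
    rows = [json.loads(line) for line in log_text.splitlines() if line.strip()]
    return [row for row in rows if row["call_id"] == call_id]


buffer = io.StringIO()
log = make_logger(buffer)
log.info("stt_final", extra={"call_id": "call-1", "stage": "stt"})
log.error("tts_timeout", extra={"call_id": "call-2", "stage": "tts"})
log.info("llm_response", extra={"call_id": "call-1", "stage": "llm"})

for row in events_for_call(buffer.getvalue(), "call-1"):
    print(row)
```

With logs in this shape, a single failed call can be isolated by its identifier instead of scanning the whole file, and the same lines can be shipped to a tracing or log-search tool later.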

New comment Jan 15

Mohammad Mussab

2 likes • Nov '25

@Johann Tagle sure… thinking of going with Langfuse, but will let you know where I end up

Mohammad Mussab

0 likes • Jan 15

@Muhammad Arhan yes. They support Pipecat with OpenTelemetry.

Kwindla Kramer

Jan 6 •

General discussion

New NVIDIA open model for voice agents: Nemotron Speech ASR

NVIDIA released a new open source speech-to-text model designed from the ground up for low-latency use cases like voice agents. This is part of NVIDIA's new focus on open models, which I'm excited about. These new models in the Nemotron family include STT and TTS models, specialized models like guardrail models and LLMs. And they are completely open: open weights, training code, training data sets, and inference tooling. This new STT model is very fast. Here's a voice agent running locally on my RTX 5090 with sub-500ms voice-to-voice inference. Technical write-up and link to GitHub repo: https://www.daily.co/blog/building-voice-agents-with-nvidia-open-models/ Also, Twitter and LinkedIn if either of those platforms are your thing. (I post a lot about voice agents on both platforms.) https://x.com/kwindla/status/2008601714392514722 https://www.linkedin.com/posts/kwkramer_nvidia-just-released-a-new-open-source-transcription-activity-7414368349905821696-ufuy/

New comment Jan 15

Mohammad Mussab

1 like • Jan 15

Now it's too fast... responding faster than a human 🥀🥲

Nir Simionovich

Jan 4 •

General discussion

Musings about Vibe Coding, Pipecat, LiveKit and more

So, over the past few weeks, I've been neck deep in working with Pipecat, LiveKit and Vibe Coding. Mainly, I wanted to see what kind of mileage I can get from Vibe Coding tools, and what better way to test that than to build a Pipecat/LiveKit implementation?

So, I decided to examine 3 primary tools:
- Claude Code - Using Sonnet 3.5 (using CLI)
- OpenCode - Grok Code Fast 1
- Google Antigravity - Using Gemini 2.5

Below are my conclusions, split into several categories.

💵 Financials:
Most expensive to use - Claude Code
Least expensive to use - OpenCode

😡 Developer Experience:
Best experience - Google Antigravity
Worst experience - Claude Code

💪 Reliability:
Most reliable - Claude Code
Least reliable - OpenCode

🚅 Performance:
Fastest planning and building - Google Antigravity
Slowest planning and building - OpenCode

So, overall, there is no "one tool to rule them all" here, and what I found is that each tool is really good at performing specific tasks. Here is what I've learned about how to "leverage" these tools in order to build something successful:
- Planning can be performed with either OpenCode or Google Antigravity. Google provides free developer credits for Antigravity, and their deep-thinking and reasoning engine, when applied to software architecture and design, works very well.
- Backend development with either Claude Code or Google Antigravity. When coupled with proper topic sub-agents, these are really powerful tools. For some odd reason, Claude Code is far more capable at handling complex architectures, while Google Antigravity leans towards the "hacker style" coding.
- UI/UX development - without any question, OpenCode did a better job. It was far more capable in spitting out hundreds of lines of working UI/UX code, even faster than Claude. However, if at some point it gets stuck on a specific UI component package, it may require Claude to show it the light, so pay attention to what it's doing.
- Code Review, Security and Privacy - without any question, Claude is the winner here, with potentially the most extensive availability of sub-agent topic experts.

New comment Jan 15

Mohammad Mussab

0 likes • Jan 15

@Kwindla Kramer doing the same

Mohammad Mussab

0 likes • Jan 15

Why not use Claude Opus 4.5? And I think the Claude Max plan at 100 USD would be enough for me?

Nir Simionovich

Jan 15 •

General discussion

Small AI Voice Agents Questionnaire

Hello all, I'm trying to investigate a few hypotheses I have regarding the AI Voice Agent market. My questions are mostly related to security, observability, billing and load management. To do so, I've built the following Google Form: https://forms.gle/oFeM9J9WV9DRX9267 If you could please answer it, I would highly appreciate it. Also, once I have all the data compiled, I will publish a post with all my findings, so that people can learn from this study as well. Much appreciated.

New comment Jan 15

Mohammad Mussab

0 likes • Jan 15

Done 👍


Level 3

30 points to level up

Mohammad Mussab

@mohammad-mussab-2383

I build Voice AI systems that handle customer calls, scheduling, and follow-ups — helping SMEs capture more revenue automatically.

Active 18d ago

Joined Nov 10, 2025

Pakistan
