Hey everyone! I'm trying to integrate GeminiLive S2S (speech-to-speech) with pipecat-flows for a healthcare booking agent.
The Problem:
When pipecat-flows transitions between nodes, it sends an LLMSetToolsFrame to update the available tools. GeminiLive has to reconnect its WebSocket whenever the tools change (API limitation). After the reconnection, the conversation state breaks: Gemini stops following the new node's task messages and doesn't call the node's functions.
What works:
- OpenAI LLM + Azure STT + ElevenLabs TTS with pipecat-flows ✅
- Tool updates happen seamlessly, no reconnection needed
What doesn't work:
- GeminiLive S2S + pipecat-flows ❌
- Every node transition → reconnection → broken flow
Current workaround attempts:
- Monkey-patched process_frame to handle LLMSetToolsFrame
- Waited for the session to be ready after reconnection
- Triggered inference with the new node's context messages
- Result: behavior is still inconsistent
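The ordering of those workaround steps can be sketched as follows. This is a simplified illustration only: FakeLiveSession and handle_tools_update are hypothetical stand-ins, not pipecat or Gemini classes, and the sketch shows the intended sequence (reconnect with new tools, wait for ready, then push the node's messages) rather than the real patch.

```python
import asyncio


class FakeLiveSession:
    """Hypothetical stand-in for a live S2S connection (not a pipecat class)."""

    def __init__(self):
        self.ready = asyncio.Event()
        self.tools = []
        self.log = []

    async def reconnect(self, tools):
        # Tools are fixed per connection, so changing them means reconnecting.
        self.ready.clear()
        self.log.append("reconnect")
        await asyncio.sleep(0)  # placeholder for the network round trip
        self.tools = list(tools)
        self.ready.set()

    async def send_context(self, messages):
        # Sending before the session is ready is treated as an error here.
        if not self.ready.is_set():
            raise RuntimeError("session not ready")
        self.log.append(("context", len(messages)))


async def handle_tools_update(session, tools, node_messages, timeout=5.0):
    """Reconnect with the new tools, wait for ready, then send node messages."""
    await session.reconnect(tools)
    await asyncio.wait_for(session.ready.wait(), timeout)
    await session.send_context(node_messages)
```

In this toy version the ordering is deterministic; with the real service it is not clear that a "ready" signal guarantees the new context is honored, which may be where the inconsistency comes from.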
Questions:
1. Has anyone successfully used GeminiLive with pipecat-flows?
2. Is there a recommended pattern for handling tool updates without reconnection?
3. Should I create a custom adapter that pre-registers all tools at connection time?
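To make question 3 concrete, here is a minimal sketch of the idea, assuming the full set of tools can be declared once when the connection opens and the current node only decides which calls are honored. GatedToolRegistry, the node names, and the tool names are all hypothetical and do not come from pipecat-flows or the Gemini API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Set


@dataclass
class GatedToolRegistry:
    """Holds every tool up front; the active node gates which ones may run."""

    handlers: Dict[str, Callable[..., Any]] = field(default_factory=dict)
    node_tools: Dict[str, Set[str]] = field(default_factory=dict)
    active_node: str = ""

    def register(self, name: str, handler: Callable[..., Any], nodes: Iterable[str]) -> None:
        self.handlers[name] = handler
        for node in nodes:
            self.node_tools.setdefault(node, set()).add(name)

    def all_tool_names(self) -> List[str]:
        # The union of tools, i.e. what would be declared at connection time.
        return sorted(self.handlers)

    def set_node(self, node: str) -> None:
        # A node transition only flips a pointer; no tool update is sent.
        self.active_node = node

    def call(self, name: str, **kwargs: Any) -> Any:
        allowed = self.node_tools.get(self.active_node, set())
        if name not in allowed:
            # Return an error payload so the model can recover instead of crashing.
            return {"error": f"'{name}' is not available in step '{self.active_node}'"}
        return self.handlers[name](**kwargs)


# Hypothetical healthcare-booking tools, for illustration only.
registry = GatedToolRegistry()
registry.register("check_availability", lambda date: {"slots": ["09:00"]}, nodes=["choose_slot"])
registry.register("book_appointment", lambda slot: {"booked": slot}, nodes=["confirm"])
registry.set_node("choose_slot")
```

The trade-off is that the model sees every tool for the whole session, so the node's task messages have to carry the instruction about which functions are relevant at each step.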
Any guidance appreciated! 🙏