TL;DR: open-source desktop dictation app that sits next to every AI chat window you have open. Local Whisper for speech-to-text. Wake-word listening so you don't even click. Optional AI parse via Ollama, OpenAI, or Anthropic. MIT, free, no telemetry.
🔗 PyPI: pip install agenius-note
The problem it solves
I usually have three Claude windows, a VS Code Claude session, ChatGPT, and a local Ollama session open at the same time. Every prompt I type is a context-switch tax. The dictation built into macOS and Windows works, but it lives in the OS chrome, not next to my chats. Claude Desktop has voice, but it's tied to Claude. I wanted one mic for everything.
What it does
🎙 Push-to-talk dictation, or hands-free wake word (default "hey jarvis", train your own .onnx in the openWakeWord Colab notebook)
🧠 Pipes the transcript to your LLM of choice for a structured note/task parse. Or skip the LLM and just take raw transcripts
✅ Quick todos pane on the right ("hey jarvis, remind me to walk the dogs in an hour" routes straight in)
📋 Quick Note brain-dump panel. Talk a thought through, hit Copy, paste into your next prompt. Nothing stored
🔐 API keys live in OS keyring (Keychain / Credential Manager / Secret Service). No plaintext on disk
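The "routes straight in" reminder flow can be sketched roughly like this. This is a minimal illustration, not the app's actual parser; the function name, regex, and supported units are all assumptions:

```python
import re

# Rough sketch of routing an utterance like
# "remind me to walk the dogs in an hour" into a todo.
# Illustrative only; the real app likely handles far more phrasings.
UNITS = {"minute": 60, "hour": 3600, "day": 86400}

def parse_reminder(transcript: str):
    """Return (task, delay_seconds), or None if it's not a reminder."""
    m = re.match(
        r"remind me to (?P<task>.+?) in (?P<n>a|an|\d+) (?P<unit>minute|hour|day)s?$",
        transcript.strip().lower(),
    )
    if not m:
        return None
    n = 1 if m["n"] in ("a", "an") else int(m["n"])
    return m["task"], n * UNITS[m["unit"]]

print(parse_reminder("remind me to walk the dogs in an hour"))
# → ('walk the dogs', 3600)
```

Anything that doesn't match a reminder pattern would fall through to plain dictation (or the optional LLM parse).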
Stack
PySide6, faster-whisper (CUDA float16 with CPU fallback), openWakeWord, SQLite, Ollama / OpenAI / Anthropic. ~5K LOC. MIT license.
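The CUDA-with-CPU-fallback load mentioned above looks roughly like this, written against faster-whisper's `WhisperModel(size, device=..., compute_type=...)` constructor. A sketch, not the project's actual code; `load_whisper` and the int8 CPU pairing are assumptions:

```python
def load_whisper(model_factory, model_size: str = "base"):
    """Try CUDA float16 first, fall back to CPU.

    model_factory would be faster_whisper.WhisperModel in practice;
    it's injected here so the fallback logic is testable without a GPU.
    """
    try:
        return model_factory(model_size, device="cuda", compute_type="float16")
    except Exception:
        # No CUDA runtime or no GPU: degrade gracefully to CPU.
        # int8 is a common CPU choice; the app may use a different type.
        return model_factory(model_size, device="cpu", compute_type="int8")
```

Catching broadly here is deliberate: CTranslate2 raises different errors depending on whether CUDA is missing entirely or just misconfigured.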
It runs everywhere
Windows, macOS, Linux. Pre-built bundles are on the GitHub Releases page (SmartScreen and Gatekeeper will yell on first launch; we're not code-signed yet). Or pip install agenius-note if you have Python 3.10+.
What I'd love feedback on
- What wake word would you train? (Mine is going to be "hey biggie" once I get around to it)
- Which LLM backend feels best for your workflow?
- What's missing? (Next on my list: a chat panel inside Quick Note so I can ask the AI things without leaving the app)
If you build something with it or hit a snag, drop a comment or an issue on the repo. Happy to help wire it up.