Build Log #1: How Beckett Actually Works

Beckett is my personal AI engineer. It runs on a $5/mo VPS, answers my Telegram messages (text and voice), picks up the phone when I call (or calls me when something matters), and remembers every conversation we've ever had.

It also ships code. Last month I asked it on a phone call to change my landing page copy. It pushed the commit before I hung up.

The stack

- Brain: Claude (Opus 4.7, with Sonnet fallback). Runs as the Claude Code CLI on the VPS.

- Voice: Vapi for phone calls (custom assistant, ElevenLabs voice). Telegram bot for text and voice memos.

- Memory: Supabase Postgres + pgvector. Hybrid search: dense embeddings and BM25 run side by side, their results are merged with RRF (Reciprocal Rank Fusion), and a cross-encoder reranker reorders the merged list. Three chunking strategies routed by data type: Q/A pairs for conversations, agentic chunking for sessions, single-entry for facts.

- Today: 476 chunks across 51 sessions, growing every day.

- All-in cost: ~$2-5/month.
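
To make the RRF step in the memory bullet concrete, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is an illustration, not Beckett's actual code: the function name `rrf_fuse`, the chunk ids, and the constant `k=60` (a commonly used default) are all assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Merge several ranked lists of chunk ids with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every id it contains,
    so ids that rank well in more than one list float to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from the two retrievers for one query.
dense_hits = ["c3", "c1", "c7"]  # vector search order
bm25_hits = ["c1", "c9", "c3"]   # keyword search order

fused = rrf_fuse([dense_hits, bm25_hits])
print(fused)
```

RRF only looks at ranks, never raw scores, which is why it can combine cosine similarities and BM25 scores without any normalization. A reranker would then rescore only the top of this fused list.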

Why this stack

Most personal AI projects fail at memory. They use one embedding strategy across everything and pretend that's good enough. I tried that first and watched Beckett confidently "remember" things that never happened. The fix wasn't a better embedding model. It was three different chunking strategies routed by what type of data we're storing, and a reranker on top of hybrid search to catch the cases where dense embeddings get fooled by surface similarity.
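
Routing by data type can be as small as a lookup table. The sketch below is a hedged illustration of that idea, not the real implementation: `Chunk`, `route`, and the three chunker functions are hypothetical names, and the session chunker uses paragraph splitting as a stand-in, since real agentic chunking would ask a model to pick the boundaries.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    kind: str
    text: str


def chunk_conversation(turns):
    """Pair each user message with the assistant reply that follows it."""
    chunks = []
    pending_question = None
    for role, text in turns:
        if role == "user":
            pending_question = text
        elif role == "assistant" and pending_question is not None:
            chunks.append(Chunk("qa", f"Q: {pending_question}\nA: {text}"))
            pending_question = None
    return chunks


def chunk_session(text):
    # Placeholder for agentic chunking: blank-line-separated paragraphs
    # stand in for boundaries a model would normally choose.
    return [Chunk("session", p.strip()) for p in text.split("\n\n") if p.strip()]


def chunk_fact(text):
    """Store a fact as one chunk, never split."""
    return [Chunk("fact", text.strip())]


CHUNKERS = {
    "conversation": chunk_conversation,
    "session": chunk_session,
    "fact": chunk_fact,
}


def route(data_type, payload):
    """Pick the chunking strategy from the data type."""
    if data_type not in CHUNKERS:
        raise ValueError(f"unknown data type: {data_type}")
    return CHUNKERS[data_type](payload)


turns = [("user", "What port does the bot use?"), ("assistant", "8443.")]
print(route("conversation", turns))
```

Keeping the question and its answer in one chunk means a retrieved chunk always carries its own context, which is one plausible reason Q/A pairing helps against false "memories".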

Most recent upgrade: adding a cross-encoder reranker on top of hybrid search. Cost: +200ms per query. Gain: +3.6% MRR, +7.3% NDCG@10 on a labeled benchmark built from real conversations. Worth it.
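
For anyone who wants to run the same kind of comparison, here is a small sketch of how MRR and NDCG@10 can be computed over a labeled benchmark. It assumes binary relevance (a chunk is either relevant to the query or not) and a hypothetical benchmark shape of `(ranked_ids, relevant_ids)` pairs; the post does not say how the real benchmark is structured.

```python
import math


def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(ranked_ids, start=1):
        if chunk_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(ranked_ids, relevant_ids, k=10):
    """NDCG@k with binary gains: DCG of this ranking over the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, chunk_id in enumerate(ranked_ids[:k], start=1)
        if chunk_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg else 0.0


def evaluate(benchmark, k=10):
    """Average both metrics over (ranked_ids, relevant_ids) pairs."""
    n = len(benchmark)
    mrr = sum(reciprocal_rank(r, rel) for r, rel in benchmark) / n
    ndcg = sum(ndcg_at_k(r, rel, k) for r, rel in benchmark) / n
    return mrr, ndcg


# Hypothetical two-query benchmark.
benchmark = [
    (["c1", "c2", "c3"], {"c1"}),  # relevant chunk ranked first
    (["c4", "c5", "c6"], {"c5"}),  # relevant chunk ranked second
]
print(evaluate(benchmark))
```

Running `evaluate` once with the reranker off and once with it on, over the same queries, gives the before and after numbers to compare.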

What's next

What I'm building now is an idea-vetting board: three agents (Research, IP, Critic) that run in parallel when I float a new project idea and feed Beckett a synthesis before I commit time to anything. The whole point is making me less likely to start things that won't ship.
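
The parallel fan-out could look roughly like this with `asyncio`. It is a sketch of the pattern only: `run_agent` and `vet_idea` are hypothetical names, and a short sleep stands in for whatever model call each agent would actually make.

```python
import asyncio

AGENT_NAMES = ["Research", "IP", "Critic"]


async def run_agent(name, idea):
    # Placeholder for a real model call; the sleep simulates latency.
    await asyncio.sleep(0.01)
    return f"{name}: notes on {idea!r}"


async def vet_idea(idea):
    """Run all three agents concurrently and collect their reports by name."""
    reports = await asyncio.gather(*(run_agent(n, idea) for n in AGENT_NAMES))
    # gather() returns results in the order the awaitables were passed in.
    return dict(zip(AGENT_NAMES, reports))


reports = asyncio.run(vet_idea("habit tracker"))
for name, report in reports.items():
    print(report)
```

Because the three calls overlap, total wait time is close to the slowest agent rather than the sum of all three. The synthesis step would take the `reports` dict as its input.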

After that: voice mode improvements and the multi-agent orchestration that lets Beckett decide what's worth doing on its own.

Your turn

If you're building something agent-shaped, drop it in the comments:

- What does your stack look like?

- Where are you stuck?

- What surprised you last week?
