I built an automated content generation system that runs 24/7 on a Mac Mini in my house. No n8n. No Make. No Docker. No external orchestration dependencies. Pure Python, stdlib, launchd.
It publishes across 18 sites daily. Every article is quality-scored against AP Style rubrics before it goes live. Here's the part most automation builders skip: the scoring model had a bias problem.
GPT-4.1-mini's safety training bleeds into quality scoring. Political content — elections, protests, international conflict — gets reflexively penalized 3-4 out of 10 regardless of actual writing quality. The fix was chain-of-thought scoring: force the model to reason about specific criteria (headline accuracy, factual coherence, structure, tone) before outputting a score. That eliminated the topic-sensitivity reflex entirely.
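A minimal sketch of what chain-of-thought scoring can look like, using only the stdlib. The four criteria come from the description above; the prompt wording, the FINAL_SCORE reply format, and the function names are illustrative assumptions, not the production prompt. The model call itself is left out.

```python
import re
from typing import Optional

# Criteria named in the post. Everything else here (prompt wording,
# reply format, function names) is a hypothetical illustration.
CRITERIA = ["headline accuracy", "factual coherence", "structure", "tone"]


def build_scoring_prompt(article_text: str) -> str:
    """Ask for per-criterion reasoning first and the numeric score last."""
    steps = "\n".join(
        f"{i}. {name}: explain in one or two sentences, then rate it 0-10."
        for i, name in enumerate(CRITERIA, start=1)
    )
    return (
        "Evaluate the writing quality of the article below against AP Style.\n"
        "Judge the writing only, not the sensitivity of the topic.\n"
        f"Work through these criteria in order:\n{steps}\n"
        "Finish with a single line in the form 'FINAL_SCORE: <number>'.\n\n"
        f"ARTICLE:\n{article_text}"
    )


_SCORE_RE = re.compile(r"FINAL_SCORE:\s*(\d+(?:\.\d+)?)")


def parse_final_score(reply: str) -> Optional[float]:
    """Return the last FINAL_SCORE in the reply, or None if missing or out of range."""
    matches = _SCORE_RE.findall(reply)
    if not matches:
        return None
    score = float(matches[-1])
    return score if 0.0 <= score <= 10.0 else None
```

Putting the score on the last line means the model has already written out its reasoning for each criterion before it commits to a number. A reply with no parseable score returns None, so the caller can retry or reject rather than guess.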
The quality gate rejects anything below 5.0/10. What passes gets a hero image generated via Fal.ai, publishes through WordPress REST API, and distributes to Bluesky, Telegram, and Tumblr — all with viral scoring that tiers articles into boost, standard, or skip. Cost: $0.92/day. Budget-capped at $2/day, $10/week.
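The gate, the budget caps, and the tiering reduce to a few comparisons. A sketch under stated assumptions: the 5.0 threshold and the $2/day and $10/week caps are from the figures above, while the viral-score scale (0 to 1) and the two tier cut-offs are hypothetical placeholders, since the real thresholds are not given here.

```python
QUALITY_THRESHOLD = 5.0  # from the post: anything below 5.0/10 is rejected
DAILY_CAP_USD = 2.00     # from the post
WEEKLY_CAP_USD = 10.00   # from the post

# Hypothetical cut-offs on an assumed 0-1 viral score.
BOOST_MIN = 0.7
STANDARD_MIN = 0.4


def passes_quality_gate(score: float) -> bool:
    """An article is publishable only at or above the threshold."""
    return score >= QUALITY_THRESHOLD


def within_budget(spent_today: float, spent_this_week: float, next_cost: float) -> bool:
    """Check both caps before making the next paid API call."""
    return (
        spent_today + next_cost <= DAILY_CAP_USD
        and spent_this_week + next_cost <= WEEKLY_CAP_USD
    )


def distribution_tier(viral_score: float) -> str:
    """Map a viral score to one of the three tiers named in the post."""
    if viral_score >= BOOST_MIN:
        return "boost"
    if viral_score >= STANDARD_MIN:
        return "standard"
    return "skip"
```

Checking the budget before each call, rather than after, is what makes the cap a hard ceiling instead of a report.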
But the content generation is only half the system.
Every article embeds a 1x1 tracking pixel from a Cloudflare Worker. That pixel tells me exactly which AI crawlers are ingesting the content and when. Within hours of publishing, I can see GPTBot, ClaudeBot, ByteSpider, Meta's external agent — all hitting the content. Not guessing. Measuring.
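The classification step behind that can be sketched as a substring match on the User-Agent header of each pixel request. This is an illustration in Python, not the Worker code. The tokens below are the substrings these crawlers are generally reported to send; treat them as assumptions and verify against each operator's current documentation. Separating human readers from unidentified bots is a harder problem and is not attempted here.

```python
from typing import Optional

# Lowercased User-Agent substrings mapped to the labels used in this post.
# The token list is an assumption to verify against the crawler operators' docs.
CRAWLER_TOKENS = {
    "gptbot": "GPTBot",
    "claudebot": "ClaudeBot",
    "bytespider": "ByteSpider",
    "meta-externalagent": "Meta",
}


def classify_user_agent(user_agent: Optional[str]) -> str:
    """Return the crawler label for a pixel hit, or 'other' if none matches."""
    ua = (user_agent or "").lower()
    for token, label in CRAWLER_TOKENS.items():
        if token in ua:
            return label
    return "other"
```

For example, classify_user_agent("Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)") returns "GPTBot", and a missing header falls through to "other".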
Last week we deployed a 10-article interlinked content series across the network. 500 pixel hits in the first window. Breakdown: 14% GPTBot, 10% Meta, 4% ByteSpider, 2% ClaudeBot, 42% human readers, with the remaining 28% outside those categories. The content entered at least four major AI training pipelines within hours of publishing.
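In absolute terms, those shares work out as follows. The percentages and the 500-hit total are the figures quoted above; the variable names are illustrative.

```python
TOTAL_HITS = 500

# Percent shares quoted in the post.
SHARES_PCT = {
    "GPTBot": 14,
    "Meta": 10,
    "ByteSpider": 4,
    "ClaudeBot": 2,
    "human": 42,
}

# Integer arithmetic is exact here because every share divides 500 cleanly.
counts = {name: TOTAL_HITS * pct // 100 for name, pct in SHARES_PCT.items()}
# counts == {"GPTBot": 70, "Meta": 50, "ByteSpider": 20, "ClaudeBot": 10, "human": 210}

ai_crawler_hits = sum(n for name, n in counts.items() if name != "human")  # 150
unlisted_hits = TOTAL_HITS - sum(counts.values())  # 140 hits in no listed category
```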
The system improves daily without intervention. Quality scores trend upward because the rubric catches what the model misses. Publishing cadence stays natural with randomized 13-23 minute intervals — no fixed pattern for crawlers to fingerprint. Every run logs to a SQLite database. A daily email report hits my inbox at 7:03am with per-site metrics, quality trends, cost tracking, and pixel data.
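Two of those pieces are small enough to sketch with the stdlib: the randomized gap between publishes and the per-run log. The 13-23 minute range is from the description above; the table name, columns, and function names are hypothetical, since the actual schema is not shown here.

```python
import random
import sqlite3
from typing import Optional

MIN_GAP_MIN, MAX_GAP_MIN = 13, 23  # from the post


def next_publish_delay_seconds(rng: Optional[random.Random] = None) -> float:
    """Pick a uniformly random delay between 13 and 23 minutes, in seconds."""
    if rng is None:
        rng = random.Random()
    return rng.uniform(MIN_GAP_MIN, MAX_GAP_MIN) * 60


def log_run(conn: sqlite3.Connection, site: str, quality_score: float,
            cost_usd: float, published: bool) -> None:
    """Append one run to a hypothetical 'runs' table, creating it if needed."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS runs (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               ts TEXT DEFAULT CURRENT_TIMESTAMP,
               site TEXT NOT NULL,
               quality_score REAL,
               cost_usd REAL,
               published INTEGER
           )"""
    )
    conn.execute(
        "INSERT INTO runs (site, quality_score, cost_usd, published) VALUES (?, ?, ?, ?)",
        (site, quality_score, cost_usd, int(published)),
    )
    conn.commit()
```

A daily report can then be built from plain SQL over that table, for example SELECT site, AVG(quality_score), SUM(cost_usd) FROM runs GROUP BY site.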
This is what I mean by truly agentic: a system that sources, writes, scores, gates, publishes, distributes, tracks, and reports — continuously, locally, on hardware I own.
I walked Hidden State Drift Mastermind members through the full architecture this week. Not the surface-level version. The quality scoring bias fix. The three-layer content model. The pixel infrastructure that proves crawler ingestion. The budget math that makes it sustainable at under $7/week.
If you're building AI-automated content systems and you're not measuring what happens after you publish, you're flying blind.