We’re working with a newsletter agency that wants their competitor research fully automated.
Right now, their team has to manually:
- Subscribe to dozens of newsletters
- Read every new issue
- Track patterns (hooks, formats, CTAs, ads, tone, sections, writing style)
- Reverse-engineer audience + growth strategies
We’re trying to take that entire workflow and turn it into a single “run analysis” action.
High-level goal:
- Efficiently scrape competitor newsletters
- Structure them into a compressed format
- Run parallel issue-level analyses
- Aggregate insights across competitors
- Produce analytics-style outputs
- Track every request through the whole distributed system
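That last goal usually comes down to minting one correlation ID per trigger and stamping it on every log line and queue message. A minimal sketch (the `start_analysis` / `log` names are made up for illustration):

```python
import uuid
from contextvars import ContextVar

# Correlation ID that follows one "run analysis" request through every stage.
request_id: ContextVar[str] = ContextVar("request_id", default="")

def start_analysis(niche: str) -> str:
    """Mint one ID per trigger; every downstream log line carries it."""
    rid = uuid.uuid4().hex
    request_id.set(rid)
    log("analysis_started", niche=niche)
    return rid

def log(event: str, **fields: str) -> None:
    # Structured log line; a real system would ship this to a log aggregator.
    print({"request_id": request_id.get(), "event": event, **fields})
```

Workers would read the ID off the queue message and re-set the context var, so a dashboard query for one `request_id` shows the whole run.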
How the system works (current design):
Step 1 – You trigger an analysis
You give it a niche, and the system finds the relevant competitors.
Step 2 – Scraper fetches issues
Our engine pulls their latest issues, cleans them, and prepares them for analysis.
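The cleaning half of this step is mostly stripping markup and collapsing whitespace before anything reaches the model. A stdlib-only sketch of that part (a real scraper adds fetching, boilerplate removal, and dedup on top):

```python
import re
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Strip tags, scripts, and styles from a fetched issue, keeping the text."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)

def clean_issue(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    # Collapse the whitespace runs the markup leaves behind.
    return re.sub(r"\s+", " ", " ".join(parser.chunks)).strip()
```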
Step 3 – Convert each issue into a “structured compact format”
Instead of sending messy HTML to the LLM, we:
- extract sections, visuals, links, CTAs, and copy
- convert them into a structured, compressed representation

This cuts token usage down heavily.
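The compact format itself can be as simple as compact JSON with short keys and truncated copy. A sketch of the idea (the field names and the 400-character cap are illustrative choices, not our actual schema):

```python
import json

def compact_issue(title: str, sections: list[dict], ctas: list[str]) -> str:
    """Turn a cleaned issue into a compressed JSON payload for the LLM.

    Short keys, no markup, truncated body copy: structure survives,
    token count drops.
    """
    payload = {
        "title": title,
        # 400 chars per section is an arbitrary illustrative cap.
        "sections": [{"h": s["heading"], "t": s["text"][:400]} for s in sections],
        "ctas": ctas,
    }
    # separators=(",", ":") drops the whitespace json.dumps adds by default.
    return json.dumps(payload, separators=(",", ":"))
```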
Step 4 – LLM analyzes each issue
We ask the model to:
- detect tone
- extract key insights
- identify intent
- spot promotional content
- summarize sections
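Since each issue's analysis is independent, this step fans out in parallel. A sketch of that fan-out, with the LLM client abstracted away as a `call_llm` callable (the prompt and names are illustrative, not our production prompt):

```python
from concurrent.futures import ThreadPoolExecutor

ANALYSIS_PROMPT = """Analyze this newsletter issue. Return JSON with keys:
tone, key_insights, intent, promos, section_summaries.
Issue: {issue}"""

def analyze_issue(issue_json: str, call_llm) -> dict:
    # call_llm is whatever client you use; it takes a prompt, returns a dict.
    return call_llm(ANALYSIS_PROMPT.format(issue=issue_json))

def analyze_all(issues: list[str], call_llm, workers: int = 8) -> list[dict]:
    """Issue-level analyses don't depend on each other, so run them concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: analyze_issue(i, call_llm), issues))
```

Threads are fine here because the work is I/O-bound (waiting on API responses); swap in an async client if you prefer.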
Step 5 – System aggregates insights
Across all issues from all competitors.
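Aggregation here is mostly counting: roll the per-issue dicts up into competitor-level stats. An illustrative sketch assuming the per-issue analyses carry `tone` and `promos` keys:

```python
from collections import Counter

def aggregate(analyses: list[dict]) -> dict:
    """Roll per-issue analyses up into competitor-level stats (illustrative)."""
    tones = Counter(a["tone"] for a in analyses)
    # Share of issues that contained at least one promo/ad.
    promo_rate = sum(1 for a in analyses if a.get("promos")) / max(len(analyses), 1)
    return {
        "dominant_tone": tones.most_common(1)[0][0] if tones else None,
        "tone_distribution": dict(tones),
        "promo_rate": round(promo_rate, 2),
    }
```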
Step 6 – Results surface in a dashboard / API layer
So the team can actually use the insights, not just stare at prompts.
Now I’m very curious: what tech would you use to build this, and how would you orchestrate it?
P.S. We avoid n8n-style builders here — they’re fun until you need multi-step agents, custom token compression, caching, and real error handling across a distributed workload. At that point, “boring” Python + queues starts looking very attractive again.
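To make the "boring Python + queues" point concrete, here's the worker-loop shape in pure stdlib, retries included. In production you'd reach for Celery, RQ, or SQS instead of `queue.Queue`; the retry-and-requeue logic is what n8n-style builders make painful:

```python
import queue
import threading

def worker(jobs: "queue.Queue", handle, max_retries: int = 3) -> None:
    """Minimal queue worker: pull a job, run it, requeue on failure."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel: shut this worker down
            jobs.task_done()
            return
        payload, attempts = job
        try:
            handle(payload)
        except Exception:
            if attempts + 1 < max_retries:
                # Real systems add backoff and a dead-letter queue here.
                jobs.put((payload, attempts + 1))
        finally:
            jobs.task_done()
```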