I did this for a client last week, but this could apply to any niche where you want to become an authority.
- Load your source list (sketch 1 below)
  - Reads a Google Sheet of community sources (FB groups, subreddits, Discord channels, etc.).
  - Skips any source already processed today via a “LastRan” timestamp check.
- Scrape raw posts (sketch 2 below)
  - Uses a scraper node (Apify, RSS pull, Reddit API, etc.) to fetch the newest posts.
- Spam & noise filter (pre-AI! Sketch 3 below.)
  - A custom JS “Filter Spam” node examines each post’s text and URL.
  - Drops anything with no question, promotional language, obvious lead-gen, or duplicates.
  - Only genuine, question-style posts continue, so you never burn an OpenAI token on ads or fluff.
- AI analysis & response (sketch 4 below)
  - Feeds each filtered question into a LangChain/ChatGPT agent with your strict JSON schema.
  - Gets back a validated object:
    - qualified: true/false
    - metadata (URL, username, datetime, location)
    - a concise, on-brand answer
    - topic tags for categorization
- Normalize & dedupe (sketch 5 below)
  - A single Code node unwraps the LLM output (handles both output: [] and direct arrays), de-dupes by URL + question, and tallies how many Q&A items you generated.
- Write results & update status (sketch 6 below)
  - Appends all new Q&A rows (with qualifiedCount) to your “Q/A Followups” sheet.
  - Updates each source row with LastRan and LastRunTotal so you won’t re-scrape the same group today.
- Notifications & error alerts (sketch 7 below)
  - On success: posts “Processed Successfully” to Slack/Discord/Teams.
  - On error: immediately notifies your team with the error message, so you can fix broken selectors or API issues before they pile up.
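Sketch 1: the “already ran today” check from the load step. A minimal sketch assuming an n8n-style Code node where each sheet row arrives as an item with a LastRan ISO timestamp; the column name follows the post, everything else is illustrative.

```javascript
// Keep only sources whose LastRan date is not today.
// Assumes items = [{ json: { Source: '...', LastRan: '2024-05-01T09:30:00Z' } }, ...]
const today = new Date().toISOString().slice(0, 10); // "YYYY-MM-DD"

return items.filter((item) => {
  const lastRan = String(item.json.LastRan || '').slice(0, 10);
  return lastRan !== today; // skip anything already processed today
});
```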
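Sketch 2: one of the scraper options, pulling the newest posts from a subreddit via Reddit’s public JSON endpoint. The subreddit name and field mapping are placeholders; an Apify actor or RSS pull would slot into the same place.

```javascript
// Fetch the newest posts from a subreddit (subreddit name is a placeholder).
const res = await fetch('https://www.reddit.com/r/yourniche/new.json?limit=25', {
  headers: { 'User-Agent': 'qa-followup-bot/0.1' },
});
const { data } = await res.json();

// Normalize into the shape the rest of the workflow expects.
const posts = data.children.map((c) => ({
  url: `https://www.reddit.com${c.data.permalink}`,
  username: c.data.author,
  datetime: new Date(c.data.created_utc * 1000).toISOString(),
  text: `${c.data.title}\n${c.data.selftext || ''}`,
}));

return posts.map((p) => ({ json: p }));
```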
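Sketch 3: the pre-AI spam filter. The promo keyword list and heuristics here are illustrative, not the actual rules from the client build; tune them to your niche.

```javascript
// Drop promos, link-drops, non-questions, and duplicates before any AI call.
const PROMO = /\b(buy now|discount|promo code|dm me|sign up|free trial)\b/i;
const seen = new Set();

function isGenuineQuestion(post) {
  const text = (post.text || '').trim();
  if (!text.includes('?')) return false;  // must actually ask something
  if (PROMO.test(text)) return false;     // promotional language
  if (/https?:\/\/\S+/.test(text) && text.length < 80) return false; // bare link-drop / lead-gen
  const key = `${post.url}|${text.slice(0, 100)}`;
  if (seen.has(key)) return false;        // duplicate
  seen.add(key);
  return true;
}

return items.filter((item) => isGenuineQuestion(item.json));
```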
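Sketch 4: one way to express the strict JSON schema the agent must return. Field names mirror the post; the exact schema, and whether you enforce it via structured output or a separate validation node, is up to you.

```javascript
// JSON Schema the agent's reply is validated against.
const responseSchema = {
  type: 'object',
  required: ['qualified', 'metadata', 'answer', 'tags'],
  properties: {
    qualified: { type: 'boolean' }, // is this worth answering?
    metadata: {
      type: 'object',
      required: ['url', 'username', 'datetime'],
      properties: {
        url: { type: 'string' },
        username: { type: 'string' },
        datetime: { type: 'string' },
        location: { type: 'string' },
      },
    },
    answer: { type: 'string' }, // concise, on-brand reply
    tags: { type: 'array', items: { type: 'string' } }, // topic tags
  },
};
```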
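Sketch 5: the normalize-and-dedupe Code node. The unwrap logic (output: [] vs. a bare array) and the URL + question key follow the post; the question field name is an assumption.

```javascript
// Unwrap LLM output that may arrive as { output: [...] } or as a bare object/array.
const raw = items.flatMap((item) => {
  const out = item.json.output ?? item.json;
  return Array.isArray(out) ? out : [out];
});

// De-dupe by URL + question (the question field name is hypothetical).
const seen = new Set();
const deduped = raw.filter((qa) => {
  const key = `${qa.metadata?.url}|${qa.question}`;
  if (seen.has(key)) return false;
  seen.add(key);
  return true;
});

// Tally how many qualified Q&A items this run produced.
const qualifiedCount = deduped.filter((qa) => qa.qualified).length;
return deduped.map((qa) => ({ json: { ...qa, qualifiedCount } }));
```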
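Sketch 6: the values written back to each source row after a run. The actual append and update would be Google Sheets nodes; this only shows the status payload (column names follow the post).

```javascript
// Status payload for the source row, so tomorrow's run knows what happened today.
const qualifiedCount = $json.qualifiedCount ?? 0; // carried over from the dedupe node
return [{
  json: {
    LastRan: new Date().toISOString(), // checked by the load step on the next run
    LastRunTotal: qualifiedCount,      // how many Q&A rows this source produced
  },
}];
```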
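Sketch 7: success/error notifications via an incoming webhook. The URL is a placeholder; Discord and Teams webhooks take the same JSON-POST shape, give or take field names.

```javascript
// Post a one-line status message to a chat webhook.
const webhookUrl = 'https://hooks.slack.com/services/XXX/YYY/ZZZ'; // placeholder

async function notify(text) {
  await fetch(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
}

// Success path:
await notify('Processed Successfully: 12 new Q&A rows'); // example count
// Error path (wired to the workflow's error branch):
// await notify(`Scrape failed for ${sourceName}: ${errorMessage}`);
```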
Adapting to Any Niche
- Sources → point the source sheet at the communities where your audience lives (FB groups, subreddits, Discord servers).
- Spam filter tuning → adjust the keyword and heuristic rules to your niche’s jargon and spam patterns.
- AI prompt & tags → rewrite the agent prompt, answer voice, and topic tags for your domain.
- Output destinations → send the Q&A rows to whatever sheet or database you work from.
- Notification channels & cadence → pick Slack/Discord/Teams and how often you want run summaries.
With the precheck in place (you can adjust its settings as well), every OpenAI call is spent on a genuine question, maximizing ROI on your token spend and keeping your community engagement both efficient and relevant.