Been building quietly for the past few weeks and finally ready to share.
I submitted SnapCaption AI to the Flowroom AI App Creation Contest, and honestly, I learned way more than I expected during this build.
What it does
You upload a photo, the AI actually looks at it, and in ~10 seconds you get:
- 3 platform-optimized captions
- A 3-slide Instagram story
- 30 ready-to-use hashtags
Plus 5 tone styles: Funny, Heartfelt, Poetic, Gen-Z, Professional
Works for IG, LinkedIn, Twitter.
🧠 The interesting part: Vision AI
This isn't a generic caption generator.
The AI analyzes the actual image (scene, lighting, mood), not just keywords.
But getting this right was tricky…
- It would hallucinate random stuff ("midnight vibes" on a bright beach pic)
- Sometimes added irrelevant context like timestamps
- Had to build a validation + retry layer to clean outputs
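
For anyone curious what a validation + retry layer like this can look like, here is a minimal sketch in plain JS. It is not the actual SnapCaption code: the result shape (`captions`, `storySlides`, `hashtags`), the timestamp regex, and both function names are assumptions made for illustration. It checks the counts promised above, rejects text containing timestamp-like strings, and re-runs a caller-supplied `generate` function until the output passes or attempts run out.

```javascript
// Hypothetical sketch: the result shape and all names here are assumptions.
// Matches clock times like "10:45 PM" and ISO dates like "2024-05-01".
const TIMESTAMP_PATTERN =
  /\b\d{1,2}:\d{2}(?::\d{2})?\s*(?:am|pm)?\b|\b\d{4}-\d{2}-\d{2}\b/i;

// Returns a list of problems; an empty list means the output passed.
function validateOutput(result) {
  if (!result || typeof result !== "object") return ["result is not an object"];
  const problems = [];
  const { captions, storySlides, hashtags } = result;
  if (!Array.isArray(captions) || captions.length !== 3) {
    problems.push("expected 3 captions");
  }
  if (!Array.isArray(storySlides) || storySlides.length !== 3) {
    problems.push("expected 3 story slides");
  }
  if (!Array.isArray(hashtags) || hashtags.length !== 30) {
    problems.push("expected 30 hashtags");
  }
  const texts = [
    ...(Array.isArray(captions) ? captions : []),
    ...(Array.isArray(storySlides) ? storySlides : []),
  ];
  if (texts.some((t) => typeof t !== "string" || TIMESTAMP_PATTERN.test(t))) {
    problems.push("text contains a timestamp or a non-string entry");
  }
  return problems;
}

// Calls `generate` until the output validates or attempts run out.
// `generate` is any async function supplied by the caller (e.g. a model call);
// it receives the previous problems so it can tighten the next prompt.
async function generateWithRetry(generate, maxAttempts = 3) {
  let problems = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await generate(problems);
    problems = validateOutput(result);
    if (problems.length === 0) return result;
  }
  throw new Error(
    `Output failed validation after ${maxAttempts} attempts: ${problems.join("; ")}`
  );
}
```

Structural checks like these only catch the mechanical failures (wrong counts, stray timestamps). Catching a mood mismatch like "midnight vibes" on a beach photo would need some extra signal about the image itself, which this sketch does not attempt.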
🛠️ Stack
- Flowroom (app infra)
- GPT-4o Vision (image + captions)
- Vanilla JS
- CSS animations
Tried to make it feel like a real product, not just a demo.
💡 3 key learnings
1. Vision AI = prompt engineering game. If your prompt isn't tight, the output drifts fast.
2. UX matters more than AI. Loading states, copy buttons, history: spent more time here than on the model.
3. Shipping > perfect. Had bugs till the last minute. Real-world fixes hit different vs sandbox builds.
Try it
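
On the first learning, here is a rough idea of what a "tight" prompt can mean in practice. This is an illustrative sketch, not the prompt SnapCaption ships with: the wording, the JSON key names, and the `buildPrompt` / `buildRequestBody` helpers are all assumptions. The request body follows the OpenAI Chat Completions format for image input; how Flowroom wraps that call is not covered here.

```javascript
// Hypothetical sketch: prompt wording, key names, and function names are assumptions.
const TONES = ["Funny", "Heartfelt", "Poetic", "Gen-Z", "Professional"];

// Builds a prompt that pins down the tone, forbids the drift described
// above, and fixes the exact output shape.
function buildPrompt(tone) {
  if (!TONES.includes(tone)) throw new Error(`Unknown tone: ${tone}`);
  return [
    `Write social media copy for the attached photo in a ${tone} tone.`,
    "Rules:",
    "- Describe only what is clearly visible (scene, lighting, mood).",
    "- Do not mention a time of day unless the image clearly shows it.",
    "- Do not include timestamps, dates, or camera metadata.",
    "Return only JSON with exactly these keys:",
    '- "captions": array of 3 strings',
    '- "storySlides": array of 3 strings',
    '- "hashtags": array of 30 strings, each starting with #',
  ].join("\n");
}

// Request body in the OpenAI Chat Completions format for image input.
// `imageUrl` can be an https URL or a base64 data URL.
function buildRequestBody(tone, imageUrl) {
  return {
    model: "gpt-4o",
    response_format: { type: "json_object" },
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: buildPrompt(tone) },
          { type: "image_url", image_url: { url: imageUrl } },
        ],
      },
    ],
  };
}
```

Spelling out the negative rules and the exact output keys gives the model less room to improvise, and it makes the response easy to check mechanically afterwards.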
If you post content regularly, would love your feedback (especially on caption quality across different images)
Happy to answer anything about the build.