Been building quietly for the past few weeks and finally ready to share.
I submitted SnapCaption AI to the Flowroom AI App Creation Contest, and honestly, I learned way more than I expected during this build.
What it does
You upload a photo, the AI actually looks at it, and in ~10 seconds you get:
- 3 platform-optimized captions
- A 3-slide Instagram story
- 30 ready-to-use hashtags
Plus 5 tone styles: Funny, Heartfelt, Poetic, Gen-Z, Professional
Works for IG, LinkedIn, Twitter.
The interesting part: Vision AI
This isn't a generic caption generator.
The AI analyzes the actual image (scene, lighting, mood), not just keywords.
But getting this right was tricky...
- It would hallucinate random stuff ("midnight vibes" on a bright beach pic)
- Sometimes added irrelevant context like timestamps
- Had to build a validation + retry layer to clean outputs
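For anyone curious, here is a minimal sketch of what a validation + retry layer like this could look like in vanilla JS. This is not the actual SnapCaption code: the JSON shape (`captions`, `storySlides`, `hashtags`), the timestamp heuristic, and the `generate` callback standing in for the model call are all assumptions made for illustration.

```javascript
// Assumed output shape (hypothetical):
// { captions: string[3], storySlides: string[3], hashtags: string[30] }
// Returns a list of problems; an empty list means the output passed.
function validateOutput(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    return ["response is not valid JSON"];
  }
  if (data === null || typeof data !== "object") {
    return ["response is not a JSON object"];
  }

  const problems = [];
  const isStringList = (value, count) =>
    Array.isArray(value) &&
    value.length === count &&
    value.every((s) => typeof s === "string" && s.trim() !== "");

  if (!isStringList(data.captions, 3)) problems.push("expected 3 non-empty captions");
  if (!isStringList(data.storySlides, 3)) problems.push("expected 3 non-empty story slides");
  if (!isStringList(data.hashtags, 30)) problems.push("expected 30 non-empty hashtags");

  // Assumed heuristic: flag clock times such as "3:45 PM", since stray
  // timestamps were one of the irrelevant details showing up in outputs.
  const timestamp = /\b\d{1,2}:\d{2}\s?(am|pm)?\b/i;
  const texts = [data.captions, data.storySlides].filter(Array.isArray).flat();
  if (texts.some((t) => typeof t === "string" && timestamp.test(t))) {
    problems.push("contains a timestamp");
  }
  return problems;
}

// `generate` is a placeholder for whatever calls the model. It receives the
// problems from the previous attempt so the next prompt can address them.
async function generateWithRetry(generate, maxAttempts = 3) {
  let problems = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const raw = await generate(problems);
    problems = validateOutput(raw);
    if (problems.length === 0) return JSON.parse(raw);
  }
  throw new Error(
    `Output still invalid after ${maxAttempts} attempts: ${problems.join("; ")}`
  );
}
```

Passing the previous problems back into `generate` is one possible design choice: it lets the retry prompt say what went wrong instead of blindly asking again.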
Stack
- Flowroom (app infra)
- GPT-4o Vision (image + captions)
- Vanilla JS
- CSS animations
Tried to make it feel like a real product, not just a demo.
3 key learnings
1. Vision AI = prompt engineering game. If your prompt isn't tight, the output drifts fast.
2. UX matters more than AI. Loading states, copy buttons, history: I spent more time here than on the model.
3. Shipping > perfect. Had bugs till the last minute. Real-world fixes hit different vs sandbox builds.
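To make the first learning concrete, a "tight" prompt might pin down the tone, the exact output shape, and what the model must not invent. The `buildPrompt` helper and its wording below are illustrative assumptions, not the prompt SnapCaption actually uses; only the five tone names and the output counts come from this post.

```javascript
const TONES = ["Funny", "Heartfelt", "Poetic", "Gen-Z", "Professional"];

// Hypothetical helper: builds the text instructions sent alongside the image.
function buildPrompt(tone) {
  if (!TONES.includes(tone)) {
    throw new Error(`Unknown tone: ${tone}`);
  }
  return [
    `Write social media content for the attached photo in a ${tone} tone.`,
    "Describe only what is visible in the image: scene, lighting, mood.",
    "Do not mention dates, clock times, or places unless they are clearly visible.",
    'Reply with JSON only: {"captions": [3 strings], "storySlides": [3 strings], "hashtags": [30 strings]}.',
  ].join("\n");
}
```

Rejecting unknown tones up front keeps a typo in the UI from silently turning into a vague prompt.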
Try it
If you post content regularly, would love your feedback (especially on caption quality across different images).
Happy to answer anything about the build.