Automating timecard PDF → structured data (OCR + PDFVector + n8n)

Hi everyone 👋 I’m building a SaaS where users currently upload an Excel file that is manually created from daily worker timecard PDFs.

I’ve attached:

Goal

Allow users to upload the PDF directly and automatically extract, per worker:

Challenges

Plan

Questions

Is PDFVector reliable for row-level timecard extraction, or is it better used only as a helper?
What is the best OCR + extraction approach for scanned timecards?
How would you design this pipeline for reliability at scale?

Appreciate any guidance or real-world experience 🙏
