Looking for advice:
I have a client project that requires building a large, ethical, and compliant dataset of up to 10M records of Indian IT professionals. Deliverables are CSV batches of 100k profiles each.
Priority fields: email, phone, full name, job title, employer, location, profile URL. Optional fields: education, skills, headline/summary, photo, provenance metadata.
Important: no scraping that violates a site's TOS, no bypassing of access controls, and no private data. Only public or consented data, ideally from vendors, partners, or opt-in sources.
Has anyone here successfully done something at this scale (10M+)?
- What vendors, integrations, or strategies worked best for you?
- Any recommended verification pipelines for email/phone at this volume?
Appreciate any pointers.