I’m interested in joining the Dojo. Right now, I’m exploring a Fabric implementation with a focus on building a scalable and reusable ingestion framework, ideally using PySpark and notebooks.
What I’m looking for is guidance on designing a modular, reusable system—for example, if I already have data in Delta Lake folders or SQL sources, how do I set up a framework that can handle ingestion without building everything from scratch each time?
I’d like to create a core ingestion engine, then extend that with frameworks for data quality, data governance, etc. Essentially, I want to build reusable patterns instead of reinventing the wheel for each new dataset or process.
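To make the ask a bit more concrete, here's roughly the shape I'm picturing for the core engine. This is just a sketch of my own idea, not something I have working: the names (SourceConfig, read_source, ingest), the paths, and the JDBC URL are all placeholders, and things like authentication and error handling are left out.

```python
# Rough sketch of a config-driven ingestion engine (placeholder names throughout).
from dataclasses import dataclass
from pyspark.sql import SparkSession, DataFrame

@dataclass
class SourceConfig:
    name: str            # logical dataset name
    kind: str            # "delta" or "sql"
    location: str        # Delta folder path, or a table name for SQL sources
    target_table: str    # Lakehouse table to land the data in
    jdbc_url: str = ""   # only needed for SQL sources (auth options omitted here)

def read_source(spark: SparkSession, cfg: SourceConfig) -> DataFrame:
    """Dispatch on source type; new source kinds get added here once, not per dataset."""
    if cfg.kind == "delta":
        return spark.read.format("delta").load(cfg.location)
    if cfg.kind == "sql":
        return (spark.read.format("jdbc")
                .option("url", cfg.jdbc_url)
                .option("dbtable", cfg.location)
                .load())
    raise ValueError(f"Unsupported source kind: {cfg.kind}")

def ingest(spark: SparkSession, cfg: SourceConfig) -> None:
    """Core engine: read the source, then write to the Lakehouse.
    Data-quality and governance checks would hook in between these two steps."""
    df = read_source(spark, cfg)
    df.write.format("delta").mode("append").saveAsTable(cfg.target_table)

if __name__ == "__main__":
    # In a Fabric notebook a Spark session is already provided; getOrCreate() reuses it.
    spark = SparkSession.builder.getOrCreate()
    sources = [
        SourceConfig("sales", "delta", "Files/landing/sales", "bronze_sales"),
        SourceConfig("customers", "sql", "dbo.Customers", "bronze_customers",
                     jdbc_url="jdbc:sqlserver://<server>;database=<db>"),
    ]
    for cfg in sources:
        ingest(spark, cfg)
```

The idea is that onboarding a new dataset should only mean adding a config entry, while the quality and governance frameworks plug into the same engine rather than being rebuilt each time.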
Does the Dojo offer content or best practices for this kind of architectural approach? Something like a plug-and-play framework setup that we can adapt and scale? Thanks!