Pinned
[Start Here] Welcome to The Modern Data Community!
Hello! Welcome to The Modern Data Community. The goal of this community is to help Data Engineers on small (or solo) teams confidently build modern architectures by simplifying key concepts, clarifying common strategies & learning from others. Pumped to have you here!

==================== HOW IT WORKS ====================

By joining, you get instant access to tons of free content (see Classroom). Dive right in. But even more can be unlocked by contributing to the Community, which I encourage you to do. It works like this:

Contribute (post/comment) >> Get points (likes) >> Reach new levels >> Unlock content

==================== 6 SIMPLE GUIDELINES ====================

❌ Do not post error messages looking for others to debug your code. That's why Stack Overflow and other tool-specific Slack channels exist.
❌ Do not use this community for self-promotion (unless Admin approved). We all know it when we see it.
❌ Do not create low-quality posts with poor grammar/spelling or that provide little (or no) value. These will be deleted. You can do better!
✅ Ask questions, share your experiences & overall be a good person. This not only helps everyone get better, but can help you unlock bonus content faster. Win-win.
✅ Speaking of wins, share yours! Whether it's finally solving a complex problem, hitting a team milestone or starting a new gig - post about it. You'll get the props you deserve, and it just might inspire somebody else.
✅ Take the time to craft thoughtful posts & always proofread before hitting submit. We're all about quality here. High-quality posts --> more engagement (aka you'll climb the leaderboard & unlock content) --> a community that stays enjoyable for everyone.

==================== QUICK LINKS ====================

Here are a few links to help you get going:
- Classroom
- What's Your Data Stack?
- Leaderboard
- Work with me (Kahan Data Solutions)
30
122
New comment 13d ago
Pinned
What's your Data Stack?
It's one thing to read articles or watch videos about perfectly crafted data architectures and think you're way behind. But back here in reality, things get messy & nothing is ever perfect or 100% done. Most of us are usually working on architectures that are:

- Old & outdated
- Hacked together
- Mid-migration to new tools
- Non-existent

Or perhaps you're one of the lucky ones that recently started from scratch and things are running smoothly. Regardless, the best way to learn what's working (and not working) is from others. I believe this could be one of the best insights this community can collectively offer each other.

So let's hear it. What does your data stack look like for the following components?

1. Database/Storage
2. Ingestion
3. Transformation
4. Version Control
5. Automation

Feel free to add other items as well outside of these 5, but we can focus on these to keep it organized.
8
64
New comment 13d ago
Data Warehousing w/ dbt - A 3-Layered Approach
This is a friendly reminder that the "data grass" isn't always greener on the other side. Everyone is doing their best and every business has its own unique challenges. But one thing I've recently noticed is that many teams struggle in the same area - the data warehouse.

While there's no one-size-fits-all approach, I found myself repeating my recommendation over the past few weeks, so I figured I'd share it here. For context, what I'm going to share is focused on dbt projects. The typical scenario is that a business starts a project on their own but quickly finds themselves with an unorganized and/or unscalable project. Which is how they end up talking to me.

At a high level, here's the simple 3-layered approach I follow:

> Layer 1: Staging
> Layer 2: Warehouse
> Layer 3: Marts

Staging:
- Create a 1:1 model for each source table (deploy as views to avoid duplicate storage)
- Light transformations for modularity (ex. renaming columns, simple case-whens, type conversions)
- Break down into sub-folders by source system
- Deploy to a Staging schema

models/staging/[source-system]

Warehouse:
- Pull from the Staging layer (simple transforms already handled)
- Facts: keys & metrics (numeric values)
- Dimensions: primary key & context (descriptive, boolean, date values)
- Deploy to a single Warehouse schema

models/warehouse/facts
models/warehouse/dimensions

Marts:
- Pull from the Warehouse layer (facts & dims allow for simple joins)
- Create wide tables w/ multiple use cases (vs. 1:1 for each report)
- Either deploy to a single Mart schema or break up by business unit/user grouping

models/marts (or) models/marts/[business-unit]

This doesn't cover other important topics like Environments, CI/CD & Documentation. But if you're also working on your own project or considering approaches, hopefully this will help!

Other dbt users - how do you structure your project?
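P.S. To make the Staging layer concrete, here's a minimal sketch of a 1:1 staging model. The source system, table and column names are hypothetical - swap in your own (and note it assumes the raw table is declared in a sources .yml file):

-- models/staging/salesforce/stg_salesforce__accounts.sql
-- Hypothetical example: a 1:1 view over a raw Salesforce table with only
-- light transformations (renames, casts, a simple case-when).
{{ config(materialized='view') }}

select
    id                          as account_id,
    name                        as account_name,
    case
        when type = 'Customer' then true
        else false
    end                         as is_customer,
    cast(created_date as date)  as created_date
from {{ source('salesforce', 'accounts') }}

Deploying it as a view means zero duplicate storage, while the Warehouse layer gets a clean, consistently named interface to build facts and dimensions from.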
3
3
New comment 2h ago
ETL Recommendations
Hey all,

My company currently has a bunch of Lambda functions in AWS that extract data from APIs into S3 and then into Snowflake. The Lambda function process is working but has limitations:

a) It has a complex setup, and making changes takes a lot of time
b) Monitoring isn't very visible
c) CDC is a challenge to manage

Since I probably won't be able to get the company to pay for a new ETL tool, I need to think of some free tools I can recommend that make the pipeline robust, easy to monitor, and quick when adding sources or making changes.

I am looking at Airbyte - any advice on this? What other alternatives are there?
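For context, the S3-to-Snowflake leg of a setup like this typically boils down to a COPY INTO from an external stage - roughly like the sketch below (stage and table names are simplified placeholders):

-- Hypothetical example: load JSON files landed in S3 into a raw table.
-- Assumes raw.api_events has a single VARIANT column and that
-- s3_landing_stage is an external stage pointing at the bucket.
copy into raw.api_events
from @s3_landing_stage/api_events/
file_format = (type = 'json')
on_error = 'skip_file';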
1
3
New comment 24h ago
Metric Layer in dbt Core
Hey team! One of my company's goals for the coming quarter is to define an aggregated metric layer for rapid insight into business-critical metrics. I've done something similar previously, but not with dbt. I'd like to know if anyone has done this with dbt Core and what the best way to approach it would be (rough sketch of what I have in mind below). Thanks!
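Something like a simple aggregated rollup model, materialized on top of a warehouse fact table - all table and column names here are just placeholders:

-- models/marts/metrics/agg_daily_orders.sql
-- Hypothetical example: a daily metrics rollup built on a warehouse fact table.
{{ config(materialized='table') }}

select
    order_date,
    count(distinct order_id)  as total_orders,
    sum(order_total)          as revenue,
    avg(order_total)          as avg_order_value
from {{ ref('fct_orders') }}
group by order_date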
1
1
New comment 1d ago
A community of data professionals building architectures with modern tools & strategies.