
Memberships

Modern Data Community

Public • 573 • Free

2 contributions to Modern Data Community
Ratio of data engineers to analytics engineers
What is the ratio of people in your organization working on your data integration layer vs. your data modeling/transformation layer vs. your BI/visualization layer? Regardless of job title (data engineer, analytics engineer, data analyst, reporting analyst, whatever), how many people in your stack are working just on data integration vs. modeling?

I ask because I've seen this fluctuate across the orgs I've been in based on tooling choice and the health of the system. In the best, most agile org, the ratio of data engineers to analytics engineers was 1:3 to 1:4. In the least agile, most frustrating (for the business stakeholders waiting on data and reports), the ratio was 3:1 to 4:1, totally reversed.

Thus far, this has been a tooling issue. The choice of data integration tool caused an enormous amount of manual work to acquire data and put together all the orchestration pieces before it was available for modeling or transformation.

Help me fight my bias and tell me what you've seen in your previous roles and in your current role.
0
2
New comment 16d ago
Data Warehousing w/ dbt - A 3 Layered Approach
This is a friendly reminder that the "data grass" isn't always greener on the other side. Everyone is doing their best and every business has its unique challenges. But one thing I've recently noticed is that many teams struggle in the same area: the data warehouse.

While there's no one-size-fits-all approach, I found myself repeating my recommendation over the past few weeks, so I figured I'd share here. For context, what I'm going to share is focused on dbt projects. The typical scenario is that a business starts a project on their own but quickly finds themselves with an unorganized and/or unscalable project. Which is how they end up talking to me.

At a high level, here's the simple 3-layered approach I follow:

> Layer 1: Staging
> Layer 2: Warehouse
> Layer 3: Marts

Staging:
- Create a 1:1 model for each source table (deploy as views to avoid duplicate storage)
- Light transformations for modularity (ex. renaming columns, simple case-whens, conversions)
- Break down into sub-folders by source system
- Deploy to a Staging schema
  models/staging/[source-system]

Warehouse:
- Pull from the Staging layer (simple transforms already handled)
- Facts: keys & metrics (numeric values)
- Dimensions: primary key & context (descriptive, boolean, date values)
- Deploy to a single Warehouse schema
  models/warehouse/facts
  models/warehouse/dimensions

Marts:
- Pull from the Warehouse layer (facts & dims allow for simple joins)
- Create wide tables w/ multiple use cases (vs. 1:1 for each report)
- Either deploy to a single Mart schema or break up by business unit/user grouping
  models/marts (or) models/marts/[business-unit]

This doesn't cover other important topics like Environments, CI/CD & Documentation. But if you're also working on your own project or considering approaches, hopefully this will help!

Other dbt users - how do you structure your project?
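To make the Staging conventions concrete, here's a minimal sketch of what a 1:1 staging model could look like in a dbt project. The `salesforce` source and all column names are made up for illustration:

```sql
-- models/staging/salesforce/stg_salesforce__accounts.sql
-- Hypothetical example: 1:1 with one source table, deployed as a view,
-- with only light transforms (renames, a simple case-when, a cast).
{{ config(materialized='view') }}

select
    id                        as account_id,
    name                      as account_name,
    case
        when type = 'Customer' then 'customer'
        else 'prospect'
    end                       as account_type,
    cast(createddate as date) as created_date
from {{ source('salesforce', 'account') }}
```

Keeping these models thin means every downstream Warehouse model can pull clean column names without repeating the renames.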
6
9
New comment 19d ago
1 like • 21d
@Michael Kahan I'm still onboarding here in my new role, so speaking purely of previous implementations: yeah, there was a lot of complex logic on very large tables. In one role we had 15,000+ concurrent users on the reporting tool and hundreds of TBs in Snowflake, so we could not use views; our reporting layer had to be very well-clustered materialized tables.
1 like • 21d
I will say, though, that views should be the first version of the dbt layers, and you should materialize as tables only if absolutely necessary. Just my $0.02.
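That "views first, tables only when necessary" advice maps to dbt's per-model config overrides. A sketch, with hypothetical model names, of promoting one heavy reporting mart to a clustered table on Snowflake while everything else stays a view:

```sql
-- models/marts/fct_orders_reporting.sql
-- Hypothetical mart: overrides a view default only because of scale
-- (many concurrent readers), and clusters on the common filter column.
{{ config(
    materialized='table',
    cluster_by=['order_date']
) }}

select *
from {{ ref('fct_orders') }}
```

The `cluster_by` config is specific to the Snowflake adapter; on other warehouses you'd reach for that platform's equivalent (e.g. partitioning).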
Jay Archer
1
2 points to level up
@jay-archer-3963
Data Architect, Data Engineer, and Analytics Engineer

Active 14d ago
Joined Apr 26, 2024