
Memberships

Modern Data Community

Public • 573 • Free

4 contributions to Modern Data Community
Full-Stack Dev -> Data Eng... dbt/Data Modeling is hard. Guidance?
Hey guys, I'm a full-stack dev switching to data engineering, and I'm stuck on dbt + data modeling. I don't know how to approach even simple problems from a data modeling perspective. Something as basic as "revenue" gives me trouble.

If my initial DB has these tables, how might I design a star schema for revenue?
- Sales (refs a Product and a Customer)
- Expenses
- Products
- Customers

I'd want to support things like:
- Daily, weekly, monthly revenue, etc. (is this what's called the "date dimension"?)
- Revenue by customer or by product (customer/product dimensions?)
- Advanced stuff (like profit, revenue growth, etc., but all capable of applying the above dimensions as well)

How would such a schema be designed? Is a star schema the way to go? What would normally be considered the most granular table in such a case: "Sales", or a new "Revenue" model? What does a "Revenue" model even look like? Am I even allowed to do that? Does that even make sense? Although it'd be very nice to have such a model, I don't know how it'd get attached to any of the dimensions.

And aside from my specific example, are there any general rules, tips, or best practices that come to mind while reading this that anyone can share to help me learn? Coming from full-stack/backend dev, schema design was always about coming up with those initial tables, and I'd solve problems with typical coding (functions, algorithms, etc.). This way of working in data engineering feels new and different. I imagine others making the full-stack -> data transition might run into similar issues, so any insights for people in my position stuck on these types of problems would definitely be appreciated. Thanks!
1
2
New comment Mar 29
2 likes • Mar 25
There's a lot to unpack here. For determining granularity, I suggest using the smallest fact table your analytics require, i.e. revenue at the day grain. Say you build that as a view called vw_revenue_daily_fact. You would then use vw_revenue_daily_fact as the base for your month/quarter/year rollups. The reason you do this is that if something breaks, you only need to go to one fact table (the day grain), since it's the base. That makes the codebase easier to manage. From there you'd have a dim_date that gets joined to the facts to group by different grains.
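To make the "day grain as the base" idea concrete, here's a minimal sketch using Python's built-in SQLite. The table and view names follow the comment above (vw_revenue_daily_fact, dim_date), but the columns and sample rows are invented for illustration, not taken from the original poster's schema:

```python
import sqlite3

# In-memory database so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Hypothetical source table and date dimension.
CREATE TABLE sales (sale_id INTEGER, product_id INTEGER,
                    customer_id INTEGER, sale_date TEXT, amount REAL);
CREATE TABLE dim_date (date_key TEXT, month TEXT, year INTEGER);

INSERT INTO sales VALUES
  (1, 10, 100, '2024-01-05', 50.0),
  (2, 11, 100, '2024-01-05', 25.0),
  (3, 10, 101, '2024-02-10', 40.0);
INSERT INTO dim_date VALUES
  ('2024-01-05', '2024-01', 2024),
  ('2024-02-10', '2024-02', 2024);

-- Day-grain fact: one row per date/customer/product.
CREATE VIEW vw_revenue_daily_fact AS
SELECT sale_date AS date_key, customer_id, product_id,
       SUM(amount) AS revenue
FROM sales
GROUP BY sale_date, customer_id, product_id;
""")

# Monthly revenue is derived from the daily fact, not from raw sales,
# so a fix at the day grain propagates to every coarser grain.
monthly = cur.execute("""
    SELECT d.month, SUM(f.revenue) AS revenue
    FROM vw_revenue_daily_fact f
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY d.month
    ORDER BY d.month
""").fetchall()

print(monthly)  # [('2024-01', 75.0), ('2024-02', 40.0)]
```

Revenue by customer or by product works the same way: swap the dim_date join for a customer or product dimension and change the GROUP BY.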
Choosing Between BigQuery, Redshift & Snowflake
Hello everyone, I'm currently evaluating data warehouse solutions to centralize data from two transactional MySQL databases, plus event data from Segment, Salesforce, and Intercom. Our goal is to load this data into a warehouse, perform some dbt transformations, and then connect it to a BI tool. We're considering BigQuery, Redshift, and Snowflake as potential solutions but are having a tough time deciding which one would be the best fit for our needs. The main considerations we're weighing are cost, ease of use, and performance/speed. Any recommendations you have would be much appreciated!
1
12
New comment Mar 17
3 likes • Feb 27
Ease of use: probably Snowflake.

Cost: this depends on your use case; many factors can inflate/deflate it. Most cloud warehouses have similar offerings, but there are key differences between compute costs (like running queries) and storage costs (typically cheap). Snowflake has Snowpipe (an ingestion tool) that might be cheaper than using BigQuery or Redshift plus a separate ingestion tool (like Fivetran, ADF, Glue, etc.). So keep in mind how you are loading data into the warehouse and whether there is a cost benefit to using something already within that ecosystem.

Performance/speed: this is probably negligible between them all. You can scale compute and storage up/down on all of them, but it depends on the pricing tier.

Hope that answer wasn't too ambiguous, and good luck!
The 4 Tiers of Getting Paid As An Independent Data Engineer
A common myth: all independent consultants are hired directly by the client. In reality, there can be multiple layers in between. And the more layers there are, the more other firms need to make money too. That means the less you'll earn as an individual. Here are the 4 tiers of getting paid as an independent consultant:

====

Tier 1: Direct with Client. The best-case scenario (IMO) is that you personally find a client and sign a contract directly with them. This is by far the most difficult to land but pays the most, as there are no middlemen.

Tier 2: Sub-Contract via Consulting Company. Many clients have established relationships with big consulting companies and view them as "consultants", not contractors. You can join an existing project and invoice the consulting company (not the client).

Tier 3: Staffing Firm Hiring for Client. This is similar to Tier 2, but with one critical difference: in this scenario you're viewed more as a "contractor" than a consultant, which drives down the perceived value and the rate you can charge.

Tier 4: Staffing Firm Hiring for Consulting Company. If a consulting company needs help staffing their project, they will reach out for sourcing help. This means there are now two layers between you and the client, and each wants to make money.

====

Money isn't everything, and sometimes it's nice to have others find work for you. But be ready to adjust your rates accordingly.
11
2
New comment Feb 26
3 likes • Feb 26
When subcontracting, do you find they often don't want to pay your LLC? It often seems like consulting companies don't want to pay another "company" for some reason.
[Start Here] Welcome to The Modern Data Community!
Hello! Welcome to The Modern Data Community. The goal of this community is to help data engineers on small (or solo) teams confidently build modern architectures by simplifying key concepts, clarifying common strategies & learning from others. Pumped to have you here!

====================
HOW IT WORKS
====================

By joining, you get instant access to tons of free content (see Classroom). Dive right in. But even more can be unlocked by contributing to the community, which I encourage you to do. It works like this:

Contribute (post/comment) >> Get points (likes) >> Reach new levels >> Unlock content

====================
6 SIMPLE GUIDELINES
====================

❌ Do not post error messages looking for others to debug your code. That's why Stack Overflow and tool-specific Slack channels exist.

❌ Do not use this community for self-promotion (unless Admin approved). We all know it when we see it.

❌ Do not create low-quality posts with poor grammar/spelling or that provide little (or no) value. These will be deleted. You can do better!

✅ Ask questions, share your experiences & overall be a good person. This not only helps everyone get better, but can help you unlock bonus content faster. Win-win.

✅ Speaking of wins, share yours! Whether it's finally solving a complex problem, hitting a team milestone or starting a new gig, post about it. You'll get the props you deserve, and it just might inspire somebody else.

✅ Take the time to craft thoughtful posts & always proofread before hitting submit. We're all about quality here. High-quality posts --> more engagement (aka you'll climb the leaderboard & unlock content) --> a community that stays enjoyable for everyone.

====================
QUICK LINKS
====================

Here are a few links to help you get going:
- Classroom
- What's Your Data Stack?
- Leaderboard
- Work with me (Kahan Data Solutions)
31
123
New comment 20d ago
3 likes • Feb 26
Hey all, I'm Max, a senior data engineer based out of Nashville, TN. I've been in data for about 6 years, having transitioned from a completely different career (theft and fraud investigations; boy, do I have some stories about that!). I currently work at a healthcare startup owned by Alphabet, and I also consult with enterprise companies on setting up their data warehouses. I love all things SQL and data modeling, and I'm trying to learn and grow as much as I can as a data engineer. In my free time I love doing Brazilian Jiu-Jitsu and hanging out with my wife and two kids. If you'd like to reach out, feel free to shoot me a message at max@mindfuldatastrategy.com
1-4 of 4
Max Walzenbach
2
8 points to level up
@max-walzenbach-1768
I'm a data engineer at an Alphabet company and run my own data consultancy, specializing in all things SQL and data modeling.

Active 56d ago
Joined Feb 19, 2024