
Memberships

Data Innovators Exchange

443 members • Free

7 contributions to Data Innovators Exchange
Data Vault is dead!
I hear more and more people asking why you should take on all the overhead of creating a Data Vault on modern data platforms. They often argue that with a persistent data lake and a modern data platform that lets you virtualize everything on top of the lake, you don't need a Data Vault. What's your take on this?
4 likes • Feb 10
Again, they are talking about technologies, not methodologies. People forget DV 2.x is not just a way to model data but an architecture and methodology too. All of it still applies in the modern data landscape. Dan's new DV 2.1 course now includes discussions of data lakes, PSAs, and virtualization. In the end you still need a plan and an approach, and at a minimum you need the business ontology and taxonomy to properly build a virtualization layer that makes sense to the business. You can use DV for all of that! A logical DV model can certainly be used to build the model inside a virtualization tool.
Looking for Resources on Data Lakehouse Architectures in Snowflake
I am currently diving deeper into data lakehouse architectures and looking for resources or implementation examples specifically related to building lakehouses on Snowflake. So far, I have primarily focused on the architecture of Databricks on Azure and AWS, and now I want to explore Snowflake in more detail to identify commonalities and significant differences, such as Delta Lake vs. Apache Iceberg. If anyone has good sources, articles, or personal insights, I would greatly appreciate your input!
3 likes • Jan 11
Check Snowflake.com/blog for lake house articles. Also their resources page for white papers and presos. I know some of the Field CTOs and SE have done talks and posts over the last few years. Check LinkedIn postings from Kevin Bair at Snowflake as he works exclusively on data cloud architecture these days.
Sometimes you read (IMHO) an uninformed opinion about Data Vault that just makes you shrug
For me it was this: "Moreover, there seems to be little justification for adopting a Data Vault model, especially considering the flexibility of the lakehouse architecture." Pity it's about to be published in an O'Reilly book. https://learning.oreilly.com/library/view/-/9781098178826/ch02.html
2 likes • Nov '24
Sadly too many people do not understand the difference between an architecture (e.g., Medallion) and a methodology like Data Vault 2.x.
Seeking tips on Data Mesh & Fabric Organization
I'm currently diving deeper into the concepts of Data Mesh and Data Fabric, especially from an organizational and strategic perspective. Does anyone have good resources or reading recommendations on this topic? I'm particularly interested in the organizational aspects but also open to resources that focus more on the architectural approaches. Thanks in advance!
6 likes • Oct '24
Check out https://datameshlearning.com/ . They also have a Slack channel with lots of activity on it.
Modelling Address Data
The question arose as to the correct way to model address data. Michael Olschimke explained his point of view and shared his idea on how to proceed in the latest session of Data Vault Friday. You can find the video either here in the classroom or below. How would you model address data? Would you do it differently to Michael?
3 likes • Sep '24
Usually, I prefer to put address attributes in a Sat. However, I did have this same case in a Healthcare warehouse because there were multiple specialty clinics housed at the same address. The BK was very similar to the example, and it really was the best way to identify a building owned by the agency, and was how the business thought of it. We used the full set of attributes for the BK and hashed them according to DV 2.0 best practices.

I would never recommend using a surrogate key for the address BK (or any Hub, actually) as that really will not work unless you have only one source system for addresses over time. Very unlikely, hence the use of true BK attributes to future-proof your design.

I do like the idea of using a JSON document with all the attributes as the BK and then hashing that for the Hub PK. I did not have that as an option at the time. With the VARIANT data type in Snowflake, that is now a valid option to consider. However, as Michael points out, if the keys within the JSON get reordered, you might end up loading technical duplicates because they will hash to different values. In that case you may need a Same-As Link to align them in your Business Vault.
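The key-reordering problem described above can be avoided by canonicalizing the JSON before hashing. Here is a minimal sketch in Python: the `hub_hash_key` function and the hard-rule normalization (uppercase, sorted keys, compact separators) are illustrative assumptions, not a prescribed DV 2.0 standard, and real implementations would typically do this in SQL inside the platform.

```python
import hashlib
import json

def hub_hash_key(bk_attributes: dict) -> str:
    """Hash a multi-attribute business key for a Hub PK (illustrative sketch).

    sort_keys=True canonicalizes attribute order, so the same address
    arriving with reordered JSON keys still hashes to the same value
    and does not load as a technical duplicate.
    """
    canonical = json.dumps(bk_attributes, sort_keys=True, separators=(",", ":"))
    # Upper-casing is a simple hard-rule normalization; MD5 is shown only
    # as a common choice of hash function, not a requirement.
    return hashlib.md5(canonical.upper().encode("utf-8")).hexdigest()

# Same address, keys arriving in different order from two source feeds:
feed_a = {"street": "120 MAIN ST", "city": "DENVER", "state": "CO", "zip": "80202"}
feed_b = {"zip": "80202", "state": "CO", "city": "DENVER", "street": "120 MAIN ST"}

assert hub_hash_key(feed_a) == hub_hash_key(feed_b)  # no technical duplicate
```

Without `sort_keys=True`, the two feeds would serialize differently, hash to different values, and create exactly the technical duplicates that would later need a Same-As Link to resolve.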
Kent Graziano
@kent-graziano-4146
Semi-retired Snowflake and Data Vault evangelist. Author, speaker, advisor.

Active 37d ago
Joined Jul 25, 2024