
Memberships

Data Innovators Exchange

443 members • Free

7 contributions to Data Innovators Exchange
Data Vault is dead!
I hear more and more people asking why you should take on all the overhead of creating a Data Vault on modern data platforms. They often argue that with a persistent data lake and a modern data platform that lets you virtualize everything on top of the lake, you don't need a Data Vault. What's your take on this?
4 likes • Feb 10
Again, they are talking about technologies, not methodologies. People forget DV 2.x is not just a way to model data but an architecture and methodology too. All of it still applies in the modern data landscape. Dan's new DV 2.1 course now includes discussions of data lakes, PSAs, and virtualization. In the end you still need a plan and an approach, and at a minimum you need the business ontology and taxonomy to properly build a virtualization layer that makes sense to the business. You can use DV for all of that! A logical DV model can certainly be used to build the model inside a virtualization tool.
Looking for Resources on Data Lakehouse Architectures in Snowflake
I am currently diving deeper into data lakehouse architectures and looking for resources or implementation examples specifically related to building lakehouses on Snowflake. So far, I have primarily focused on the architecture of Databricks on Azure and AWS, and now I want to explore Snowflake in more detail to identify commonalities and significant differences, such as Delta Lake vs. Apache Iceberg. If anyone has good sources, articles, or personal insights, I would greatly appreciate your input!
3 likes • Jan 11
Check Snowflake.com/blog for lake house articles. Also their resources page for white papers and presos. I know some of the Field CTOs and SE have done talks and posts over the last few years. Check LinkedIn postings from Kevin Bair at Snowflake as he works exclusively on data cloud architecture these days.
Sometimes you read (IMHO) an uninformed opinion about Data Vault that just makes you shrug
For me it was this: "Moreover, there seems to be little justification for adopting a Data Vault model, especially considering the flexibility of the lakehouse architecture." Pity it's about to be published in an O'Reilly book. https://learning.oreilly.com/library/view/-/9781098178826/ch02.html
2 likes • Nov '24
Sadly too many people do not understand the difference between an architecture (e.g., Medallion) and a methodology like Data Vault 2.x.
Seeking tips on Data Mesh & Fabric Organization
I'm currently diving deeper into the concepts of Data Mesh and Data Fabric, especially from an organizational and strategic perspective. Does anyone have good resources or reading recommendations on this topic? I'm particularly interested in the organizational aspects but also open to resources that focus more on the architectural approaches. Thanks in advance!
6 likes • Oct '24
Check out https://datameshlearning.com/ . They also have a Slack channel with lots of activity on it.
Modelling Address Data
The question arose as to the correct way to model address data. Michael Olschimke explained his point of view and shared his idea on how to proceed in the latest session of Data Vault Friday. You can find the video either here in the classroom or below. How would you model address data? Would you do it differently to Michael?
3 likes • Sep '24
Usually, I prefer to put address attributes in a Sat. However, I did have this same case in a Healthcare warehouse because there were multiple specialty clinics housed at the same address. The BK was very similar to the example, and it really was the best way to identify a building owned by the agency, and was how the business thought of it. We used the full set of attributes for the BK and hashed them according to DV 2.0 best practices.

I would never recommend using a surrogate key for the address BK (or any Hub, actually) as that really will not work unless you have only one source system for addresses over time. Very unlikely, hence the use of true BK attributes to future-proof your design.

I do like the idea of using a JSON document with all the attributes as the BK and then hashing that for the Hub PK. I did not have that as an option at the time. With the VARIANT data type in Snowflake, that is now a valid option to consider. However, as Michael points out, if the keys within the JSON get reordered, you might end up loading technical duplicates because they will hash to different values. In that case you may need a Same-As Link to align them in your Business Vault.
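The key-reordering problem described above can be avoided by canonicalizing the JSON before hashing. Here is a minimal sketch in Python: the `hub_hash_key` function and the hard-rule normalization (uppercase, sorted keys, compact separators) are illustrative assumptions, not a prescribed DV 2.0 standard, and real implementations would typically do this in SQL inside the platform.

```python
import hashlib
import json

def hub_hash_key(bk_attributes: dict) -> str:
    """Hash a multi-attribute business key for a Hub PK (illustrative sketch).

    sort_keys=True canonicalizes attribute order, so the same address
    arriving with reordered JSON keys still hashes to the same value
    and does not load as a technical duplicate.
    """
    canonical = json.dumps(bk_attributes, sort_keys=True, separators=(",", ":"))
    # Upper-casing is a simple hard-rule normalization; MD5 is shown only
    # as a common choice of hash function, not a requirement.
    return hashlib.md5(canonical.upper().encode("utf-8")).hexdigest()

# Same address, keys arriving in different order from two source feeds:
feed_a = {"street": "120 MAIN ST", "city": "DENVER", "state": "CO", "zip": "80202"}
feed_b = {"zip": "80202", "state": "CO", "city": "DENVER", "street": "120 MAIN ST"}

assert hub_hash_key(feed_a) == hub_hash_key(feed_b)  # no technical duplicate
```

Without `sort_keys=True`, the two feeds would serialize differently, hash to different values, and create exactly the technical duplicates that would later need a Same-As Link to resolve.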
Kent Graziano
@kent-graziano-4146
Semi-retired Snowflake and Data Vault evangelist. Author, speaker, advisor.

Active 37d ago
Joined Jul 25, 2024