Memberships

Data Innovators Exchange

Public • 170 • Free

7 contributions to Data Innovators Exchange
JSON data in a Satellite
Hi, for some use cases we are planning to put JSON data into a variant column in a Satellite without extracting it into individual columns. We would then extract the required columns in the Business Vault. Is this approach acceptable in the Data Vault pattern? We are using a Snowflake database.
1
1
New comment 13h ago
1 like • 13h
Definitely acceptable! I have done this a couple of times, but only when the structure of the JSON was not stable, e.g. fields appear and disappear between deliveries, datatypes are not consistent, or hierarchy levels within the JSON change. If that's the case, I usually go for the "let's handle this later" approach, especially when the amount of data is not that large and we can virtualize the Business Vault by reading the JSON directly (schema-on-read). This allows us to adjust the views without reloading anything every time. But as soon as the data is mostly "flat" and consistent, I would go for the schema-on-write approach to get 1) better downstream performance and 2) avoid applying datatypes on the way out and writing more complex queries (even if that is already easy in Snowflake, it is still a bit more effort).
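The schema-on-read Business Vault view described above could look roughly like this in Snowflake (table, column, and JSON field names are hypothetical):

```sql
-- Hypothetical Business Vault view: extracts fields on read
-- from a VARIANT column (payload) stored in a Raw Vault Satellite.
CREATE OR REPLACE VIEW biz_vault.sat_customer_details AS
SELECT
    hub_customer_hk,
    load_date,
    payload:customer_name::STRING AS customer_name,
    payload:address.city::STRING  AS city,
    payload:order_count::NUMBER   AS order_count
FROM raw_vault.sat_customer_json;
```

Because it is a view, a new or changed field in the JSON only requires a view change, not a reload of the Satellite.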
My 5 Tips when working with Snowflake
Of course there are dozens of tips available for Snowflake, but let me share the ones that came to mind first:
1) Understand how Snowflake stores the data! It uses micro-partitions, organized in a columnar way. Micro-partitions store statistics such as distinct values and value ranges for each column. Your goal should always be to prune as much as possible on both when querying data. For example: only select the columns you really need, and apply filters on columns whose values mostly do not overlap multiple micro-partitions. Also think about re-clustering your data if necessary, or creating your own values with a pattern to cluster your data on (usually only necessary for huge amounts of data in one table).
2) When data is spilled to local storage while querying, that is a good indicator that a bigger warehouse makes sense. I assume here that the query itself is already optimized and we are just dealing with a lot of data and maybe complex logic. But keep in mind: increasing the size of the Snowflake virtual warehouse by one step (e.g. M -> L) doubles the cost for the same runtime (calculated per cluster). So, if the query time drops below 50% of the original, we achieve a win-win: a faster and cheaper result! If the runtime cannot be reduced by 50% or more, you have to decide whether the quicker response is worth the money you now spend.
3) Snowflake's zero-copy clones allow you to test features and fixes against your production data in a very easy and fast way. They should be part of your deployment pipelines.
4) Insert-only loading reduces the number of versions Snowflake has to create for the micro-partitions. Updates and deletes cause this versioning of already existing micro-partitions, which costs time and additional storage. That also means that Data Vault, with its insert-only approach, meets the scalability characteristics of Snowflake!
5) The QUALIFY clause improved code writing a lot. It uses the result of a window function as a filter, which means you don't have to write nested sub-queries with self-joins.
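Tip 5 in practice: a sketch (hypothetical table and column names) that keeps only the latest row per key, with no nested sub-query or self-join:

```sql
-- Keep only the most recent satellite row per hub key.
-- QUALIFY filters directly on the window-function result.
SELECT
    hub_customer_hk,
    load_date,
    customer_name
FROM raw_vault.sat_customer
QUALIFY ROW_NUMBER() OVER (
    PARTITION BY hub_customer_hk
    ORDER BY load_date DESC
) = 1;
```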
8
2
New comment 12h ago
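The break-even in tip 2 can be sanity-checked with simple arithmetic (illustrative credit numbers, not actual Snowflake pricing):

```python
# Illustrative cost model: one warehouse size step doubles the
# credit rate per hour, so cost = rate * runtime.
def query_cost(rate_per_hour: float, runtime_hours: float) -> float:
    return rate_per_hour * runtime_hours

# Medium warehouse: 4 credits/h, query runs 1.0 h -> 4 credits.
cost_m = query_cost(4, 1.0)

# Large warehouse: 8 credits/h. If the runtime exactly halves,
# the cost is unchanged; any further speed-up makes it cheaper.
cost_l_half = query_cost(8, 0.5)  # same cost, half the wait
cost_l_fast = query_cost(8, 0.4)  # >50% faster -> also cheaper

print(cost_m, cost_l_half, cost_l_fast)
```

If the larger warehouse cannot cut the runtime below half, you are paying extra purely for the faster answer.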
Requirement Gathering
Hey community, I am wondering what your best practices are for gathering requirements from the business to enable data engineers to work. Imagine you build a data platform for several purposes: reporting, dashboarding, cleansed data for AI input, etc. The general output is Information Marts (however they are modeled), and we want to build it the agile way. How much and what kind of information would be necessary from the business? How much in-person communication would you expect during development? A user story itself is always nice for understanding the intention, but on its own it is far too little, as details are missing (in my opinion). Looking forward to your opinions :)
6
3
New comment Jul 17
0 likes • Jul 16
Agree. Documentation about decisions/agreements is very important. It's sometimes a kind of insurance :) But what degree of information would you prefer if you had to deliver business value? Just a high-level user story and a contact person (i.e. a subject matter expert)? A business model? A mockup report? KPI definitions based on a template? A huge documentation page with just text? Data engineers should be able to start developing as fast as possible and not spend a lot of time understanding the requirements.
Data Vault Friday
Whom can we welcome to Data Vault Friday tomorrow at 11am CEST? :) Ask your question upfront and get an answer from Michael Olschimke: https://scalefr.ee/DataVaultFriday Or just attend and learn: https://us02web.zoom.us/webinar/register/WN_TsjULDy7Tuiwp0clye0BSQ#/registration Duration: usually 10-20 minutes.
5
0
Do Oracle users like Data Vault?
Currently tweaking datavault4dbt to work with Oracle. How come Oracle users are less interested in Data Vault than Snowflake users?
6
8
New comment Jul 15
3 likes • Jul 11
Another perception from a different perspective: as Oracle is one of the "older" database vendors, companies often already had their data warehouse solution on it before Data Vault became an industry-wide approved way of building a data platform. When they reach the point where the whole solution no longer scales, they look for the state of the art and plan an all-around refresh: a new way of working (agile), a new approach (Data Vault), new technology (mostly OLAP databases in the cloud). I think as soon as it comes to these migration projects where everything gets renewed and Data Vault is the way to go, Oracle's technology is no longer the top favorite for a restart. I see the same for SQL Server, btw. Not as pronounced, but similar.
Marc Winkelmann
@marc-winkelmann-2004
Hi, my name is Marc and I am implementing Data Platforms with the focus on Data Vault 2.0. Looking forward to chat/talk with you :)

Active 9h ago
Joined Jun 27, 2024
Hanover, Germany