Data Engineering

Databricks recommended approach to load Data Vault 2.0

Subha0920
New Contributor II

Hi,

Please share the recommended approach to load Data Vault 2.0.

Overview

1. Current landscape: Lakehouse (Bronze/Silver/Gold).

2. Data Vault 2.0 is to be created in the Silver layer.

3. Bronze data will be made available in Delta tables using ETL.

Questions

1. What should the strategy be for loading data from the Bronze layer to the Silver layer?

2. What approach should we adopt to parallelize the load of the Data Vault 2.0 tables?

3. How do we pick up incremental data from the Delta tables when loading the Silver layer?

4. a) How can we reuse the notebooks that load the Silver layer (Data Vault 2.0) for other source systems?

b) Where should the logic for populating the hub/link/satellite tables be encapsulated for every entity (e.g., views)?

c) How do we configure the DQ rules for every entity/table?

5. What type of metadata-driven approach can be adopted?

6. What convention should we adopt for Unity Catalog?

Example: Unity Catalog name: Bronze; schema name: source system name; tables: tables for every source.

Unity Catalog name: Silver; schema: what needs to be provided here?; tables: Data Vault 2.0 tables.

7. Exception handling, reprocessing from the point of failure, and auditing.

8. Cluster configuration: all-purpose cluster or warehouse cluster?
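
For questions 4b and 5, one possible shape of a metadata-driven load is sketched below. This is only an illustration and not an official Databricks recommendation. It assumes the usual Data Vault 2.0 practice of deriving hub keys by hashing normalized business keys and detecting satellite changes with a hash diff. The metadata layout, table names, column names, and helper functions are all hypothetical, and the logic is shown row by row in plain Python so it can be run anywhere.

```python
import hashlib

# Hypothetical metadata for one source entity; every name here is illustrative.
ENTITY_META = {
    "source_table": "bronze.crm.customer",
    "hub": {"name": "hub_customer", "business_keys": ["customer_id"]},
    "satellite": {"name": "sat_customer_details", "attributes": ["name", "email"]},
}


def _digest(values, delimiter="||"):
    """SHA-256 over the delimited values, returned as a hex string."""
    return hashlib.sha256(delimiter.join(values).encode("utf-8")).hexdigest()


def hash_key(row, columns):
    """Hub hash key: business keys are trimmed and upper-cased before hashing,
    so ' c-001 ' and 'C-001' map to the same hub record (an assumed rule)."""
    return _digest(
        "" if row.get(c) is None else str(row[c]).strip().upper() for c in columns
    )


def hash_diff(row, columns):
    """Satellite hash diff: attributes are trimmed but keep their case,
    so any change in a descriptive value produces a new digest."""
    return _digest(
        "" if row.get(c) is None else str(row[c]).strip() for c in columns
    )


def build_rows(row, meta, load_ts, record_source):
    """Derive one hub row and one satellite row from a source row,
    driven only by the metadata dictionary."""
    hub, sat = meta["hub"], meta["satellite"]
    hk = hash_key(row, hub["business_keys"])
    common = {"hub_hash_key": hk, "load_ts": load_ts, "record_source": record_source}
    hub_row = {**common, **{c: row.get(c) for c in hub["business_keys"]}}
    sat_row = {
        **common,
        "hash_diff": hash_diff(row, sat["attributes"]),
        **{c: row.get(c) for c in sat["attributes"]},
    }
    return hub_row, sat_row


if __name__ == "__main__":
    source_row = {"customer_id": " c-001 ", "name": "Ada", "email": "ada@example.com"}
    hub_row, sat_row = build_rows(source_row, ENTITY_META, "2024-01-01T00:00:00Z", "crm")
    print(hub_row["hub_hash_key"])
    print(sat_row["hash_diff"])
```

Because the keys are computed from the source row alone, a new source system would only need a new metadata entry, not new code. In an actual Spark job the same idea would normally be written with column expressions (for example `sha2` over `concat_ws` in `pyspark.sql.functions`) rather than row-wise Python.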

3 REPLIES

ilir_nuredini
Honored Contributor

Hello @Subha0920 ,

I have previously implemented Data Vault 2.0 on Databricks. The implementation details are too long to cover here,
but these resources from Microsoft and Databricks gave us a lot of insight:
Data Vault 2.0 using Databricks Lakehouse Architecture on Azure

What’s a Data Vault and How to Implement It on the Databricks Lakehouse Platform - The Databricks Bl...

The articles may be a bit old, but they are quite helpful.

Hope that helps a bit.

Best, Ilir

Thanks, @ilir_nuredini. It is helpful.

If you can share details on the questions above, that would help us plan further.

Subha0920
New Contributor II

Kindly provide your input and suggestions on the questions above.
