Hi,
Please share the recommended approach for loading Data Vault 2.0.
Overview
1. Current Landscape - Lakehouse (Bronze/Silver/Gold)
2. Data Vault 2.0 is to be created in the Silver layer.
3. Bronze data will be made available in Delta tables using ETL.
Questions
1. What should the strategy be for loading data from the Bronze layer to the Silver layer?
2. What approach should be adopted to parallelize the load of the Data Vault 2.0 tables?
3. How should incremental data be picked up from the Delta tables when loading the Silver layer?
4. a) How can we reuse the notebooks that load the Silver layer (Data Vault 2.0) for other source systems?
b) Where should the logic that populates the hub/link/satellite tables be encapsulated for each entity (e.g., in views)?
c) How should the DQ rules be configured for each entity/table?
5. What type of metadata-driven approach can be adopted?
6. What naming convention should be adopted for Unity Catalog?
e.g., Catalog name - Bronze, schema name - source system name, tables - the tables for each source.
Catalog name - Silver, schema name - what should be used here?, tables - Data Vault 2.0 tables.
7. How should exception handling, reprocessing from the point of failure, and auditing be handled?
8. What cluster configuration is recommended (all-purpose cluster or warehouse cluster)?
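
To make question 6 concrete, here is a minimal sketch of the three-level naming (catalog.schema.table) I have in mind. The helper name qualified_name, the source system name sales_system, the table names orders and hub_order, and the Silver schema placeholder schema_tbd are all made up for illustration; the Silver schema name is exactly what I am asking about.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Build a three-level name of the form catalog.schema.table."""
    return ".".join((catalog, schema, table))


# Bronze: schema = source system name, one table per source table.
# "sales_system" and "orders" are hypothetical example names.
bronze_orders = qualified_name("bronze", "sales_system", "orders")

# Silver: Data Vault 2.0 tables. "schema_tbd" is only a placeholder,
# since the schema convention is the open question; "hub_order" is hypothetical.
silver_hub = qualified_name("silver", "schema_tbd", "hub_order")

print(bronze_orders)
print(silver_hub)
```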