szymon_dybczak
Esteemed Contributor III

Hi @Swathik ,

If we're talking about where to store ETL metadata, in my opinion it's mostly a matter of preference. In my case, I prefer storing my config in YAML files, but I’ve also worked on projects where the config was stored in Delta tables.

For instance, the article below describes an approach that uses config tables, which is also a valid option:

Metadata-Driven ETL Framework in Databricks (Part-... - Databricks Community - 92666

Regarding orchestration, I prefer to automate both the bronze and silver layers. So I typically have one generic bronze-layer notebook and one generic silver-layer notebook. Then I iterate over my ETL metadata stored in the YAML config file and load each Delta table using a foreach task in Databricks Workflows.
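To make the iteration step more concrete, here is a minimal sketch of how the metadata could be turned into one parameter set per table. The config layout (`tables`, `name`, `source_path`, `format`, `layers`), the `build_foreach_inputs` helper, and the `<layer>.<table>` target naming are all hypothetical, made up for illustration. The dict is written inline so the snippet runs anywhere; in practice it would be whatever your YAML parser returns for the config file (e.g. `yaml.safe_load` from PyYAML).

```python
import json

# Hypothetical ETL metadata, shaped the way a parsed YAML config file might look.
# Every key name here is an assumption, not a Databricks convention.
CONFIG = {
    "tables": [
        {
            "name": "customers",
            "source_path": "/landing/customers",
            "format": "csv",
            "layers": ["bronze", "silver"],
        },
        {
            "name": "orders",
            "source_path": "/landing/orders",
            "format": "json",
            "layers": ["bronze"],
        },
    ]
}


def build_foreach_inputs(config, layer):
    """Return one parameter dict per table that is enabled for the given layer."""
    inputs = []
    for table in config.get("tables", []):
        if layer in table.get("layers", []):
            inputs.append(
                {
                    "table_name": table["name"],
                    "source_path": table["source_path"],
                    "source_format": table["format"],
                    # Assumed naming scheme: <layer>.<table>
                    "target_table": f"{layer}.{table['name']}",
                }
            )
    return inputs


# Each element would become the input of one iteration of the generic notebook.
bronze_inputs = build_foreach_inputs(CONFIG, "bronze")
print(json.dumps(bronze_inputs, indent=2))
```

One way to wire this up is to run it in a small setup task, publish the list with `dbutils.jobs.taskValues.set`, and reference that task value as the input of the foreach task, so the generic bronze or silver notebook receives one element per iteration.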

For the gold layer I prefer to be explicit, because fact and dimension tables usually have different business logic. So here I implement a separate notebook for each business entity.
