Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Parameterized Delta live table pipeline

Edthehead
New Contributor III

I'm trying to create an ETL framework on Delta Live Tables and essentially reuse the same pipeline for all the transformations from bronze to silver to gold.

This works absolutely fine when I hard-code the tables and the SQL transformations as an array within the notebook itself. Now I need to move this config to another location so it can be maintained without touching the notebook.

I can't use a JSON file because I may have multiple transformations that need to be executed in sequence, so I need to sort the transformations from the config and then execute them one by one.

When I try putting the config in another Delta table, I get an error while trying to convert it into a pandas DataFrame to iterate over the rows.

I referred to this article https://docs.databricks.com/en/delta-live-tables/create-multiple-tables.html but even there the config is hard-coded in the pipeline. Is there an example of this kind of use case, or is there an alternative way to source the config from outside the DLT pipeline?
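To make the goal concrete, here's a stripped-down sketch of the pattern I'm after. It runs outside Databricks, so the dlt/spark calls are only shown in comments; the table names, sequence column, and SQL statements are placeholders, not my real config:

```python
# Config rows as they might come back from
# spark.table("etl_meta.transform_config").collect() at the top of the
# pipeline notebook (outside any @dlt.table function). The schema
# (target_table, seq, sql_stmt) and all names are placeholders.
config_rows = [
    {"target_table": "gold_orders_daily", "seq": 2,
     "sql_stmt": "SELECT order_date, count(*) AS n "
                 "FROM LIVE.silver_orders GROUP BY order_date"},
    {"target_table": "silver_orders", "seq": 1,
     "sql_stmt": "SELECT * FROM LIVE.bronze_orders "
                 "WHERE order_id IS NOT NULL"},
]

# Sort by the sequence column so dependent transformations are
# registered in the right order.
ordered = sorted(config_rows, key=lambda r: r["seq"])

def register_table(target, sql):
    # Inside a real DLT pipeline this body would be:
    #     @dlt.table(name=target)
    #     def _t():
    #         return spark.sql(sql)
    # Here it just returns the pair so the sketch runs anywhere.
    return (target, sql)

plan = [register_table(r["target_table"], r["sql_stmt"]) for r in ordered]
```

The hard-coded `config_rows` list is exactly what I want to replace with rows read from a Delta table.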

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @Edthehead, configuring your ETL framework for Delta Live Tables (DLT) can be done in a flexible and maintainable way. Let’s explore some options:

  1. Pipeline Settings in DLT:

    • DLT provides a user-friendly interface for configuring pipeline settings. You can use the UI to define and edit these settings.
    • Additionally, you have the option to display and edit settings in JSON format.
    • Most settings can be configured either through the UI or by specifying a JSON configuration.
    • Some advanced options are only available via JSON configuration.
    • Databricks recommends starting with the UI to familiarize yourself with the settings, but you can directly edit the JSON configuration in the workspace if needed.
    • JSON configuration files are useful when deploying pipelines to new environments or when using the CLI or REST API.
    • For a comprehensive reference to DLT JSON configuration settings, refer to the Delta Live Tables pipeline configurations documentation.
  2. Product Edition Selection:

    • Choose the appropriate DLT product edition based on your pipeline requirements:
      • Core: Suitable for streaming ingest workloads without advanced features like change data capture (CDC) or DLT expectations.
      • Pro: Supports streaming ingest and CDC workloads, including updating tables based on changes in source data.
      • Advanced: Includes Core and Pro features, plus support for enforcing data quality constraints using DLT expectations.
  3. Pipeline Source Code:

    • You can configure the source code defining your pipeline using the file selector in the DLT UI.
    • Pipeline source code can be defined in Databricks notebooks or in SQL/Python scripts stored in workspace files.
    • This approach allows you to separate the configuration from the notebook itself, making it easier to maintain without touching the notebook directly.
  4. Parameterization:

    • Consider parameterizing your pipeline settings.
    • You can pass configuration parameters to your pipeline, even if you’re using serverless compute resources.
    • While compute settings like Enhanced Autoscaling, cluster policies, and instance types are not available for serverless pipelines, you can still set parameters in the JSON configuration.
  5. Cloud Storage Configuration:

    • Specify the storage location for your pipeline output tables.
    • Ensure that the target schema for these tables is well-defined.
  6. Compute Settings:

    • Configure compute settings such as instance types and autoscaling.
    • Note that serverless pipelines have fully managed compute resources, so some settings may not apply.
  7. Pipeline Trigger Interval:

    • Set the interval at which your pipelines trigger.
    • This ensures timely execution of transformations.
  8. Error Handling and Notifications:

    • Add email notifications for pipeline events.
    • Control tombstone management for SCD type 1 queries.
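On point 4 (Parameterization): a parameter set in the pipeline's JSON settings under `"configuration"` can be read back in the notebook with `spark.conf.get`. A minimal sketch of that pattern — the key name `etl.config_table` is an assumption, and a plain dict stands in for `spark.conf` so the sketch runs outside Databricks:

```python
# In the pipeline's JSON settings you would set something like:
#     "configuration": { "etl.config_table": "etl_meta.transform_config" }
# and the notebook would read it back with spark.conf.get(key, default).
# _FakeConf mimics that lookup so the example is self-contained; the key
# and table names are assumptions for illustration only.
class _FakeConf:
    def __init__(self, values):
        self._values = values

    def get(self, key, default=None):
        return self._values.get(key, default)

conf = _FakeConf({"etl.config_table": "etl_meta.transform_config"})

# In a real notebook: config_table = spark.conf.get("etl.config_table", ...)
config_table = conf.get("etl.config_table", "etl_meta.transform_config")
```

This keeps the location of the metadata table out of the notebook entirely, so the same source code can serve several pipelines.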

Remember that DLT is designed to simplify ETL workflows, and its declarative approach allows you to focus on defining the desired target state. By leveraging the right settings and separating configuration from code, you can create a robust and maintainable ETL framework. If you encounter specific issues while converting to a pandas DataFrame, feel free to share more details, and I’ll be happy to assist! 🚀

 

Thanks, but I found what I was looking for in THIS article. It shows how to access a metadata Delta table that is maintained outside the pipeline and convert the data into a dictionary that can drive different processing within the pipeline.
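In case it helps the next person: the core of that approach is collecting the metadata rows once, outside any table definition, and turning them into a plain dict the pipeline code can consume. A rough sketch, with the `spark.table` call shown in a comment and placeholder rows standing in so it runs anywhere (all names are made up):

```python
# In the real pipeline the rows would come from the metadata table:
#     rows = spark.table("etl_meta.transform_config").collect()
# Placeholder (target, sql) tuples stand in here for illustration.
rows = [
    ("silver_orders", "SELECT * FROM LIVE.bronze_orders"),
    ("gold_orders", "SELECT * FROM LIVE.silver_orders"),
]

# One dict entry per target table: name -> transformation SQL.
config = {target: sql for target, sql in rows}

# Each entry can then drive a generated table definition, e.g.
#     for target, sql in config.items():
#         create_live_table(target, sql)   # hypothetical helper wrapping @dlt.table
```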
