DLT - runtime parameterisation of execution

MartinIsti · 03-13-2024

I have started using DLT in a prototype framework and now face the challenge below; any help would be appreciated.

First let me give a brief context:

I have metadata sitting in a .json file that I read in the first task and load into a log table with all the relevant attributes (including the list of tables to be processed by the DLT pipeline).
That log table holds multiple records, including those of past executions, so I have to filter it down to the current one using a timestamped ID (e.g. IngestAdventureWorks_20240314).
For that I need to pass the ID as a parameter to the DLT pipeline so it can be used in a SQL query to find the relevant records and build the list of tables to be processed.
When I hardcode it as a key-value pair at design time, I can read the value easily using the spark.conf.get("ID", None) syntax.
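
To make the lookup concrete, here is a minimal sketch of how the ID read from the pipeline configuration could drive the query. The log table name `metadata_log` and the columns `execution_id` and `table_name` are placeholders assumed for illustration, as is the helper `build_table_list_query`; only the `spark.conf.get("ID", None)` call comes from the setup described above.

```python
import re


def build_table_list_query(run_id, log_table="metadata_log"):
    """Return SQL selecting the tables logged for a single execution.

    `log_table`, `execution_id` and `table_name` are hypothetical names.
    """
    if run_id is None:
        raise ValueError("Pipeline parameter 'ID' is not set")
    # Accept only word characters (letters, digits, underscore) so the
    # value can be inlined into the SQL string without quoting issues.
    if not re.fullmatch(r"\w+", run_id):
        raise ValueError(f"Unexpected characters in run ID: {run_id!r}")
    return (
        f"SELECT table_name FROM {log_table} "
        f"WHERE execution_id = '{run_id}'"
    )


# Inside the DLT pipeline notebook, where `spark` is predefined:
# run_id = spark.conf.get("ID", None)
# rows = spark.sql(build_table_list_query(run_id)).collect()
# tables = [row.table_name for row in rows]
```

Failing fast when the ID is missing or malformed keeps a misconfigured run from silently processing the wrong set of tables.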

My question is how to pass that parameter at run time, either from a task in a workflow (similarly to how I can reference a prior task's output and pass it to a widget in a downstream notebook task) or by executing the DLT pipeline from a notebook.

This is important for making the solution truly dynamic, without hardcoded parameter values.

Thanks in advance for any help.

István