DLT - runtime parameterisation of execution

MartinIsti
New Contributor III

I have started to use DLT in a prototype framework and I now face the challenge below, for which any help would be appreciated.

First let me give a brief context:

  • I have metadata sitting in a .json file that I read as the first task and put into a log table with all the relevant attributes (including the list of tables to be processed by the DLT pipeline)
  • That log table has multiple records, including those of past executions, so I have to filter it down to the current one using a timestamped ID (e.g. IngestAdventureWorks_20240314)
  • For that I need to pass that ID as a parameter to the DLT pipeline so it can be used in a SQL query to find the relevant records and build the list of tables to be processed.
  • When I hardcode it as a Key-Value pair at design time, I can access those values easily using the spark.conf.get("ID", None) syntax (see the sketch below)
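
A minimal sketch of what that looks like in my DLT code, with hypothetical table and column names (metadata_log, execution_id) standing in for my actual metadata objects:

  import dlt
  from pyspark.sql import functions as F

  # "ID" is the Key-Value pair defined in the pipeline configuration,
  # e.g. ID = IngestAdventureWorks_20240314
  run_id = spark.conf.get("ID", None)

  @dlt.table(name="tables_to_process")
  def tables_to_process():
      # metadata_log / execution_id are placeholder names for the log table
      # and the timestamped identifier column described above
      return (
          spark.read.table("metadata_log")
               .filter(F.col("execution_id") == run_id)
      )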
 
My question/challenge is how to pass that parameter either from a task in a workflow (similar to how I can reference prior tasks' output and pass it to a widget in a downstream notebook task) or by executing the DLT pipeline from a notebook.
 
That would be really important for me, to make the solution truly dynamic without hardcoding parameter values.
 
Thanks in advance for any help
 
István
1 ACCEPTED SOLUTION

Kaniz
Community Manager

Hi @MartinIsti, passing parameters dynamically into a DLT (Delta Live Tables) pipeline can enhance flexibility and make your solution more robust.

While DLT doesn't directly support task values for parameters, there are alternative approaches you can consider:

  1. Configuration Parameters:

    • DLT allows you to define configuration parameters in the pipeline's settings or Spark configuration.
    • You can set these parameters during design time and reference them within your DLT pipeline.
    • These parameters can be accessed using spark.conf.get("parameter_name", None) syntax.
  2. Notebook Widgets:

    • If you're using notebooks within your workflow, you can leverage widgets.
    • In your notebook, use dbutils.widgets.get("widget_name") to retrieve the value of a widget.
    • You can set the widget value dynamically based on prior tasks' output or other logic.
  3. REST API Calls:

    • If you want to pass custom parameters to a DLT pipeline, consider making an HTTP call to the DLT pipeline's REST API endpoint (see the sketch after this list).
    • In the body of the API call, include a JSON object with your custom parameters.
    • For example:
      { "fullRefresh": true, "customParameter": "your_value" }
      
    • This approach allows you to dynamically set parameters when triggering the DLT pipeline from an external source (e.g., an orchestration tool).
  4. Airflow Integration:

    • If you're using Apache Airflow, you can run DLT pipelines as part of your workflow.
    • Use the DatabricksSubmitRunOperator to submit a DLT pipeline run (a sketch follows below).
    • Ensure that you pass the necessary parameters as arguments to the notebook task in your job configuration.
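
For point 3, a minimal Python sketch of triggering a pipeline update from a notebook via the Pipelines REST API; the workspace URL, pipeline ID and secret scope/key are placeholders, and the exact request-body fields (e.g. full_refresh) should be verified against the Pipelines API documentation:

  import requests

  host = "https://<your-workspace-url>"                # placeholder
  pipeline_id = "<your-pipeline-id>"                   # placeholder
  token = dbutils.secrets.get("my_scope", "my_token")  # hypothetical secret scope/key

  # Start an update of the DLT pipeline; any extra custom fields suggested
  # above would be added to this JSON body if the API accepts them.
  response = requests.post(
      f"{host}/api/2.0/pipelines/{pipeline_id}/updates",
      headers={"Authorization": f"Bearer {token}"},
      json={"full_refresh": True},
  )
  response.raise_for_status()
  print(response.json())  # typically contains the update_id of the triggered run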
 

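For point 4, a minimal Airflow sketch, assuming a recent Airflow 2.x with the Databricks provider installed; the DAG name, connection ID, pipeline ID and task payload are illustrative and should be checked against your Jobs API version:

  from datetime import datetime

  from airflow import DAG
  from airflow.providers.databricks.operators.databricks import DatabricksSubmitRunOperator

  with DAG(
      dag_id="run_dlt_pipeline",        # hypothetical DAG name
      start_date=datetime(2024, 3, 1),
      schedule=None,
      catchup=False,
  ) as dag:
      run_dlt = DatabricksSubmitRunOperator(
          task_id="trigger_dlt",
          databricks_conn_id="databricks_default",  # hypothetical connection ID
          json={
              "run_name": "dlt-metadata-driven-run",
              "tasks": [
                  {
                      # pipeline_task lets a one-time submitted run start a DLT pipeline
                      "task_key": "run_dlt",
                      "pipeline_task": {"pipeline_id": "<your-pipeline-id>"},
                  }
              ],
          },
      )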

3 REPLIES

MartinIsti
New Contributor III

Thanks Kaniz for your response. It would have been great to use an approach similar to widgets in a normal notebook. Specifying these parameters at design time does not allow the flexibility needed to run my DLT pipeline in a truly metadata-driven way.

I was also heading towards using the jobs REST API from a notebook, but I ended up tweaking my configuration tables in a way that lets me utilise a hardcoded parameter in the DLT definition and still keep it dynamic.

If that REST API call functionality could later be integrated into Workflows so that these values can be passed the same way as to other tasks, that would be really great!

I accept it as a solution because your third suggestion would work. I still keep hoping a more integrated approach will come in the future 😉

data-engineer-d
New Contributor III

@Kaniz Can you please provide a reference for the REST API approach? I do not see it in the docs.

TIA
