Data Engineering

How to dynamically pass a string parameter to a Delta Live Tables pipeline when calling it from Azure Data Factory using the REST API

rubenesanchez
New Contributor II

I want to pass some context information to a Delta Live Tables pipeline when calling it from Azure Data Factory. I know the body of the API call supports the Full Refresh parameter, but I wonder whether I can add my own custom parameters, and how they could be retrieved dynamically from the running notebook attached to the pipeline.

Any help appreciated

Thanks in advance

6 REPLIES

Anonymous
Not applicable

@Ruben Sanchez:

Yes, you can pass custom parameters to a Delta Live Tables pipeline when calling it from Azure Data Factory using the REST API. One way is to add the custom parameters to the body of the API call as a JSON object; you can then retrieve them dynamically from the running notebook attached to the pipeline.

Here is an example of how you can pass custom parameters to a Delta Live Tables pipeline using the REST API:

  1. In your ADF pipeline, add an HTTP activity to call the REST API endpoint for the Delta Live Table pipeline. In the body of the API call, include a JSON object with your custom parameters, like so:
{
  "fullRefresh": true,
  "customParameter1": "value1",
  "customParameter2": "value2"
}

2. In your Delta Live Tables pipeline notebook, retrieve the custom parameters from the body of the API call using the dbutils.widgets.get method. This method returns the value of a widget parameter passed to the notebook, which in this case is the JSON object of custom parameters.

import json
 
# Retrieve the custom parameters from the notebook widgets
custom_params_json = dbutils.widgets.get("custom_params")
custom_params = json.loads(custom_params_json)
 
# Retrieve the values of the custom parameters
custom_parameter1 = custom_params.get("customParameter1")
custom_parameter2 = custom_params.get("customParameter2")
 
# Use the custom parameters in your pipeline logic
...

In this example, we first retrieve the custom parameters object from the notebook widget using the dbutils.widgets.get method. We then parse the JSON string into a Python dictionary using the json.loads method. Finally, we retrieve the values of the custom parameters with the dictionary's get method and use them in our pipeline logic.
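For illustration, here is a minimal sketch of what that elided pipeline logic might look like, stamping one of the custom parameters onto a DLT table. The source and output table names are hypothetical placeholders:

import json
import dlt
from pyspark.sql.functions import lit

# Retrieve and parse the custom parameters as shown above
custom_params = json.loads(dbutils.widgets.get("custom_params"))
custom_parameter1 = custom_params.get("customParameter1")

# Hypothetical DLT table that stamps the parameter onto every row for auditing
@dlt.table(name="events_with_run_context")
def events_with_run_context():
    # "my_catalog.my_schema.raw_events" is a placeholder source table
    return (
        spark.read.table("my_catalog.my_schema.raw_events")
        .withColumn("custom_parameter1", lit(custom_parameter1))
    )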

Note that the custom parameter keys should be unique and must not conflict with any reserved keywords used by the Delta Live Tables pipeline API, such as "fullRefresh" in the example above.

I hope this helps! Let me know if you have any further questions.

Suteja

Thanks a lot for your response.

Sorry, it was only now that I was able to test the solution.

I tried the code you supplied in my Delta Live Tables notebook but got an error.

Code:

import dlt
from pyspark.sql.functions import *
from pyspark.sql.types import *
from datetime import datetime
import json

custom_params_json = dbutils.widgets.get("custom_params")
custom_params = json.loads(custom_params_json)
runid = custom_params.get("ADFrunid")

Error:

Py4JJavaError: An error occurred while calling o374.getArgument. : com.databricks.dbutils_v1.InputWidgetNotDefined: No input widget named custom_params is defined

If I add the widget declaration, the code runs without error:

dbutils.widgets.text("custom_params", """{"ADFrunid":"none"}""")

But the notebook does not receive the value passed in the call and only resolves the default value.

This is the code for the invocation from Azure Data Factory to the API (some key values suppressed for security reasons)

{
    "url": "https://adb-XXXXXX.azuredatabricks.net/api/2.0/pipelines/a8996a9d-a5f2-48ef-ada0-0ba98da037e7/update...",
    "method": "POST",
    "headers": {
        "Authorization": "Bearer XXXXXXX"
    },
    "body": "{\n \"fullRefresh\": false,\n \"ADFrunid\": \"6b0a0e74-0f99-463c-9ff0-74fc17ea8958\"\n}"
}
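For reference, the same call can be sketched in plain Python with the requests library. The workspace URL and token are placeholders as above, the body fields are copied verbatim from the JSON, and the path assumes the documented Pipelines API endpoint for starting an update, /api/2.0/pipelines/{pipeline_id}/updates:

import requests

# Placeholders copied from the JSON above; this is an untested sketch
workspace_url = "https://adb-XXXXXX.azuredatabricks.net"
pipeline_id = "a8996a9d-a5f2-48ef-ada0-0ba98da037e7"
token = "XXXXXXX"

body = {
    "fullRefresh": False,
    "ADFrunid": "6b0a0e74-0f99-463c-9ff0-74fc17ea8958",
}

response = requests.post(
    f"{workspace_url}/api/2.0/pipelines/{pipeline_id}/updates",
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
response.raise_for_status()
print(response.json())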

Thanks again for your help

Ruben Sanchez

Anonymous
Not applicable

Hi @Ruben Sanchez

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 

Manjula_Ganesap
Contributor

@rubenesanchez  - Did you find a solution to your problem? I have the same question. 

Vamshikrishna_r
New Contributor II

@rubenesanchez, I am also unable to fetch the details passed from ADF to the DLT notebook. Were you able to resolve this?

BLM
New Contributor II

In case this helps anyone: I could only use the refresh_selection parameter, setting it to [] by default. Then, in the notebook, I derived the custom parameter values from the refresh_selection value.
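For anyone trying this workaround, a sketch of what the update request body might look like, assuming the Pipelines API field name refresh_selection; how the notebook derives custom values from that setting is not detailed in the reply above:

{
  "full_refresh": false,
  "refresh_selection": []
}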
