04-15-2023 08:33 AM
I want to pass some context information to a Delta Live Tables pipeline when calling it from Azure Data Factory. I know the body of the API call supports the Full Refresh parameter, but I wonder whether I can add my own custom parameters and how these can be retrieved dynamically from the running notebook attached to the pipeline.
Any help appreciated
Thanks in advance
04-16-2023 12:06 AM
@Ruben Sanchez :
Yes, you can pass custom parameters to a Delta Live Table pipeline when calling it from Azure Data Factory using the REST API. One way to achieve this is by adding the custom parameters to the body of the API call as a JSON object. You can then retrieve these custom parameters dynamically from the running notebook attached to the pipeline.
Here is an example of how you can pass custom parameters to a Delta Live Table pipeline using the REST API:
1) Add the custom parameters to the body of the API call as a JSON object:
{
"fullRefresh": true,
"customParameter1": "value1",
"customParameter2": "value2"
}
2) In your Delta Live Table pipeline, retrieve the custom parameters from the body of the API call using the dbutils.widgets.get method. This method retrieves the value of a widget parameter that was passed to the notebook, which in this case is the custom parameters object.
import json
# Retrieve the custom parameters from the notebook widgets
custom_params_json = dbutils.widgets.get("custom_params")
custom_params = json.loads(custom_params_json)
# Retrieve the values of the custom parameters
custom_parameter1 = custom_params.get("customParameter1")
custom_parameter2 = custom_params.get("customParameter2")
# Use the custom parameters in your pipeline logic
...
In this example, we first retrieve the custom parameters object from the notebook widget using the dbutils.widgets.get method. We then parse the JSON string into a Python dictionary using the json.loads method. Finally, we retrieve the values of the custom parameters using the get method on the dictionary, and use them in our pipeline logic.
Note that the custom parameter keys should be unique and not conflict with any reserved keywords used by the Delta Live Table pipeline, such as "fullRefresh" in the example above.
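The parsing step and the key-collision check described above can be sketched as a small helper. This is a minimal sketch, not part of the original answer: `parse_custom_params` and `RESERVED_KEYS` are hypothetical names, and the raw JSON string is passed in as an argument so the snippet runs outside Databricks. Whether the pipeline actually forwards such a string to the notebook is a separate question that this sketch does not address.

```python
import json

# Hypothetical set of keys already used by the pipeline request body.
# "fullRefresh" is taken from the example above; extend it as needed.
RESERVED_KEYS = {"fullRefresh"}


def parse_custom_params(raw_json, reserved=RESERVED_KEYS):
    """Parse a JSON object string and reject keys that clash with reserved names."""
    params = json.loads(raw_json)
    if not isinstance(params, dict):
        raise ValueError("custom parameters must be a JSON object")
    clashes = sorted(set(params) & set(reserved))
    if clashes:
        raise ValueError(f"custom parameter keys clash with reserved keys: {clashes}")
    return params


params = parse_custom_params('{"customParameter1": "value1", "customParameter2": "value2"}')
custom_parameter1 = params.get("customParameter1")
custom_parameter2 = params.get("customParameter2")
```

Using `.get` on the resulting dictionary returns `None` for a missing key instead of raising, which matches the retrieval style in the example above.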
I hope this helps! Let me know if you have any further questions.
05-14-2023 05:03 PM
Suteja
Thanks a lot for your response.
Sorry, I was only now able to test the solution.
I tried the code you supplied in my Delta Live Table notebook but got an error.
Code:
import dlt
from pyspark.sql.functions import *
from pyspark.sql.types import *
from datetime import datetime
import json
custom_params_json = dbutils.widgets.get("custom_params")
custom_params = json.loads(custom_params_json)
runid = custom_params.get("ADFrunid")
Error:
Py4JJavaError: An error occurred while calling o374.getArgument. : com.databricks.dbutils_v1.InputWidgetNotDefined: No input widget named custom_params is defined
If I add the widget declaration the code runs without error:
dbutils.widgets.text("custom_params", """{"ADFrunid":"none"}""")
But the notebook does not get the value passed and only resolves the default value
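The behaviour described here (an error without the declaration, the default value with it) can be reproduced outside Databricks with a stand-in for the widget store. The sketch below is only an illustration: `FakeWidgets` is a hypothetical class, not the real `dbutils.widgets` implementation, and it assumes that nothing outside the notebook ever sets the widget value.

```python
import json


class FakeWidgets:
    """Hypothetical stand-in for dbutils.widgets, for illustration only."""

    def __init__(self):
        self._values = {}

    def text(self, name, default_value):
        # Declaring a widget registers its default unless a value already exists.
        self._values.setdefault(name, default_value)

    def get(self, name):
        if name not in self._values:
            raise KeyError(f"No input widget named {name} is defined")
        return self._values[name]


widgets = FakeWidgets()

# Without a declaration, get() fails, as in the error above.
try:
    widgets.get("custom_params")
    undeclared_failed = False
except KeyError:
    undeclared_failed = True

# With a declaration but no externally supplied value, only the default comes back.
widgets.text("custom_params", '{"ADFrunid":"none"}')
runid = json.loads(widgets.get("custom_params")).get("ADFrunid")
```

If the caller never populates the widget, the declared default is the only value the notebook can see, which is consistent with the result reported above.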
This is the code for the invocation from Azure Data Factory to the API (some key values suppressed for security reasons)
{
"method": "POST",
"headers": {
"Authorization": "Bearer XXXXXXX"
},
"body": "{\n \"fullRefresh\": false,\n \"ADFrunid\": \"6b0a0e74-0f99-463c-9ff0-74fc17ea8958\"\n}"
}
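The escaped `body` string above is easier to produce by serialising a dictionary than by escaping it by hand. A minimal sketch, assuming the same two keys as in the call above; the bearer token stays a placeholder, and `build_request` is a hypothetical helper name, not part of Azure Data Factory or the Databricks API.

```python
import json


def build_request(full_refresh, adf_run_id, token="XXXXXXX"):
    """Build the request settings with the body serialised as a JSON string."""
    body = {"fullRefresh": full_refresh, "ADFrunid": adf_run_id}
    return {
        "method": "POST",
        "headers": {"Authorization": f"Bearer {token}"},
        # Serialise the inner body to a string; serialising the outer
        # dictionary afterwards adds the backslash escapes automatically.
        "body": json.dumps(body),
    }


request = build_request(False, "6b0a0e74-0f99-463c-9ff0-74fc17ea8958")
decoded_body = json.loads(request["body"])
```

Decoding `request["body"]` again is a quick way to confirm that the string the API receives is valid JSON with the expected keys.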
Thanks again for your help
Ruben Sanchez
04-17-2023 02:20 AM
Hi @Ruben Sanchez
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!
08-23-2023 06:46 AM
@rubenesanchez - Did you find a solution to your problem? I have the same question.
08-27-2024 01:39 AM
@rubenesanchez, I am also unable to fetch the details passed from ADF in the DLT notebook. Were you able to resolve this?
a month ago
In case this helps anyone: the only parameter I could use was refresh_selection, which I set to [] by default. Then, in the notebook, I derived the custom parameter values from the refresh_selection value.