How to pass configuration values to a Delta Live Tables job through the Delta Live Tables API

labromb
Contributor

Hi Community,

I have successfully run a job through the API, but I need to be able to pass parameters (configuration) to the DLT pipeline via the API.

I have tried passing JSON in this format:

{
    "full_refresh": "true",
    "configuration": [
        {
            "config1": "config1_value",
            "config2": "config2_value"
        }
    ]
}

The API seems happy with the structure of the JSON, but config1 and config2 are not being overridden.

Any help greatly appreciated.

1 ACCEPTED SOLUTION

Accepted Solutions

Mo
Valued Contributor

hey @labromb ,

have you seen this documentation page?

https://docs.databricks.com/api/workspace/pipelines/update

Also, for configuration there is no need for square brackets; it is a flat object:

"configuration": {
    "property1": "string",
    "property2": "string"
}

let me know if this works 😉
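To make that concrete, here is a minimal sketch (standard library only) of sending the flat configuration object to the update endpoint and then starting a refresh, based on the endpoints in the linked docs. The host, token, and pipeline ID are placeholders, and note that PUT replaces the whole pipeline spec, so a real implementation would GET the current settings and merge before writing. The payload builder can be checked without any network call.

```python
import json
import urllib.request

DATABRICKS_HOST = "https://your_databricks_instance"  # placeholder
TOKEN = "your_databricks_token"                       # placeholder


def build_settings_payload(pipeline_id, configuration):
    """Body for PUT /api/2.0/pipelines/{pipeline_id} ("Update pipeline").

    `configuration` is a flat object of key/value strings, not a list.
    """
    return {"id": pipeline_id, "configuration": configuration}


def call_pipelines_api(path, body, method):
    # Shared helper for the two calls: edit settings, then start an update.
    req = urllib.request.Request(
        f"{DATABRICKS_HOST}{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def refresh_with_config(pipeline_id, configuration):
    # 1. Replace the pipeline settings. PUT replaces the whole spec, so a
    #    real implementation should GET the current settings and merge first.
    call_pipelines_api(
        f"/api/2.0/pipelines/{pipeline_id}",
        build_settings_payload(pipeline_id, configuration),
        method="PUT",
    )
    # 2. Trigger the refresh; full_refresh is a JSON boolean here.
    return call_pipelines_api(
        f"/api/2.0/pipelines/{pipeline_id}/updates",
        {"full_refresh": True},
        method="POST",
    )


# The payload itself can be inspected without touching the network:
print(json.dumps(
    build_settings_payload("1234-abcd", {"config1": "config1_value"}),
    indent=2,
))
```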

 


10 REPLIES

Kaniz
Community Manager

Hi @Brian Labrom​, to pass parameters to your Databricks job via the API, include them as base_parameters in the notebook_task section of your job definition.

You can then access the parameters in your notebook using dbutils.widgets.get().

Here's an example:

full_refresh = dbutils.widgets.get("full_refresh") == "true"
config1 = dbutils.widgets.get("config1")
config2 = dbutils.widgets.get("config2")

Now, when you submit the job through the API, pass the parameters in the

notebook_task section like this:

{
  "name": "My Job",
  "new_cluster": {
    "spark_version": "x.x.x-scala2.x",
    "node_type_id": "node_type",
    "num_workers": 1
  },
  "notebook_task": {
    "notebook_path": "/path/to/your/notebook",
    "base_parameters": {
      "full_refresh": "true",
      "config1": "config1_value",
      "config2": "config2_value"
    }
  }
}

Replace /path/to/your/notebook with the path to your notebook, and modify the spark_version, node_type_id, and num_workers according to your requirements.

If you're using Python or a different language for your API call, make sure to adjust the code accordingly. For example, in Python, you could use the requests library to submit the job like this:

import json
import requests
 
api_key = "your_databricks_token"
api_url = "https://your_databricks_instance/api/2.0/jobs/runs/submit"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}
 
job_config = {
  "name": "My Job",
  "new_cluster": {
    "spark_version": "x.x.x-scala2.x",
    "node_type_id": "node_type",
    "num_workers": 1
  },
  "notebook_task": {
    "notebook_path": "/path/to/your/notebook",
    "base_parameters": {
      "full_refresh": "true",
      "config1": "config1_value",
      "config2": "config2_value"
    }
  }
}
 
response = requests.post(api_url, headers=headers, data=json.dumps(job_config))

Remember to replace your_databricks_token, your_databricks_instance, and other placeholders with your actual values.

Hi @Kaniz Fatma​,

Just wondered if there was any update on this. This is quite an important aspect of how we would implement DLT pipelines so would be good to know if it can be done, or if it's coming.

Many thanks.

labromb
Contributor

Hi @Kaniz Fatma​, thanks for the detailed reply. It looks like the response covers a job, not a Delta Live Tables pipeline. Apologies if my initial question was not clear enough...

I am using the Delta Live Tables API:

Delta Live Tables API guide - Azure Databricks | Microsoft Learn

And want to refresh a DLT pipeline... I can initiate a refresh, but I need to be able to override the configuration of the DLT pipeline with the values I supply.

I am using Azure Data Factory to call the API, so I just need to know what the JSON format needs to be in the request body so I can override the parameters.
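Assuming the update and start-update endpoints of the pipelines API, the request bodies for two ADF Web activities might look like the sketch below (pipeline ID and config values are placeholders). One detail worth noting: full_refresh in the /updates body is a JSON boolean, not the quoted string "true" from the original post.

```python
import json

# Hypothetical pipeline ID -- substitute your own.
pipeline_id = "1234-abcd"

# Web activity 1 (only needed if the values must change): PUT the pipeline
# settings to https://<workspace>/api/2.0/pipelines/<pipeline_id> with the
# configuration as a flat object -- no square brackets.
edit_body = {
    "id": pipeline_id,
    "configuration": {
        "config1": "config1_value",
        "config2": "config2_value",
    },
}

# Web activity 2: start the refresh with a POST to
# https://<workspace>/api/2.0/pipelines/<pipeline_id>/updates.
# full_refresh is a JSON boolean here, not the string "true".
start_update_body = {"full_refresh": True}

print(json.dumps(edit_body, indent=2))
print(json.dumps(start_update_body))
```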

Manjula_Ganesap
New Contributor III

@labromb - Please let me know if you found a solution to your problem. I'm trying to do the same thing. 
