How to pass configuration values to a Delta Live Tables job through the Delta Live Tables API

labromb
Contributor

Hi Community,

I have successfully run a job through the API, but I need to be able to pass parameters (configuration values) to the DLT workflow via the API.

I have tried passing JSON in this format:

{ 
    "full_refresh": "true",
    "configuration": [ 
        {
               "config1": "config1_value",
               "config2": "config2_value"
          }
    ]
}

The API seems happy with the structure of the JSON, but config1 and config2 are not being overridden.
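
For what it's worth, in the pipeline settings JSON the configuration field appears to be a flat string-to-string object rather than an array, so presumably the intended shape would be something like this (just my guess; config1/config2 are my own keys):

{
    "full_refresh": "true",
    "configuration": {
        "config1": "config1_value",
        "config2": "config2_value"
    }
}

though it's not clear the updates endpoint accepts a configuration field at all.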

Any help greatly appreciated.

8 REPLIES

Kaniz
Community Manager

Hi @Brian Labrom​, to pass parameters to your Databricks job via the API, you can supply them as base_parameters in the notebook_task when launching the job.

You'll need to modify the notebook_task configuration in your job to pass these parameters as arguments to the notebook.

Inside the notebook, you can then read the parameters using dbutils.widgets.get().

Here's an example:

# Read the job parameters inside the notebook; widget values arrive as strings
full_refresh = dbutils.widgets.get("full_refresh") == "true"
config1 = dbutils.widgets.get("config1")
config2 = dbutils.widgets.get("config2")
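
If the notebook also needs to run interactively, you can declare the widgets with defaults first (optional; the widget names here just mirror the base_parameters below):

# Optional: create the widgets with defaults so the notebook also runs outside a job
dbutils.widgets.text("full_refresh", "false")
dbutils.widgets.text("config1", "")
dbutils.widgets.text("config2", "")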

Now, when you submit the job through the API, pass the parameters in the notebook_task section like this:

{
  "name": "My Job",
  "new_cluster": {
    "spark_version": "x.x.x-scala2.x",
    "node_type_id": "node_type",
    "num_workers": 1
  },
  "notebook_task": {
    "notebook_path": "/path/to/your/notebook",
    "base_parameters": {
      "full_refresh": "true",
      "config1": "config1_value",
      "config2": "config2_value"
    }
  }
}

Replace /path/to/your/notebook with the path to your notebook, and modify the spark_version, node_type_id, and num_workers according to your requirements.

If you're using Python or a different language for your API call, make sure to adjust the code accordingly. For example, in Python, you could use the requests library to submit the job like this:

import json
import requests
 
# Workspace URL and personal access token (replace with your own values)
api_key = "your_databricks_token"
api_url = "https://your_databricks_instance/api/2.0/jobs/runs/submit"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}
 
job_config = {
  "name": "My Job",
  "new_cluster": {
    "spark_version": "x.x.x-scala2.x",
    "node_type_id": "node_type",
    "num_workers": 1
  },
  "notebook_task": {
    "notebook_path": "/path/to/your/notebook",
    "base_parameters": {
      "full_refresh": "true",
      "config1": "config1_value",
      "config2": "config2_value"
    }
  }
}
 
response = requests.post(api_url, headers=headers, data=json.dumps(job_config))

Remember to replace your_databricks_token, your_databricks_instance, and other placeholders with your actual values.
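
It's also worth checking the response; runs/submit returns a run_id you can use to poll the run's status, for example:

# Raise on HTTP errors, then capture the run_id for later status checks
response.raise_for_status()
run_id = response.json()["run_id"]
print(f"Submitted run {run_id}")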

Hi @Kaniz Fatma​,

Just wondered if there was any update on this. This is quite an important aspect of how we would implement DLT pipelines, so it would be good to know if it can be done, or if it's coming.

Many thanks.

labromb
Contributor

Hi @Kaniz Fatma​, thanks for the detailed reply. It looks like the response is talking about a job, not a Delta Live Tables pipeline. Apologies if my initial question was not clear enough...

I am using the Delta Live Tables API:

Delta Live Tables API guide - Azure Databricks | Microsoft Learn

And I want to refresh a DLT pipeline... I can initiate a refresh, but I need to be able to override the configuration of the DLT pipeline with the values I supply.

I am using Azure Data Factory to call the API, so I just need to know what the JSON format needs to be in the request body so I can override the parameters.
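
If the updates endpoint won't take configuration directly, then presumably the pipeline settings have to be edited first and the refresh triggered afterwards. Something roughly like this is what I have in mind (just a sketch; pipeline_id and the config keys are placeholders):

import requests

host = "https://your_databricks_instance"
token = "your_databricks_token"
pipeline_id = "your_pipeline_id"
headers = {"Authorization": f"Bearer {token}"}

# 1. Fetch the current pipeline spec so the edit keeps all existing settings
spec = requests.get(f"{host}/api/2.0/pipelines/{pipeline_id}", headers=headers).json()["spec"]

# 2. Override the configuration map (a flat string-to-string object)
spec["configuration"] = {**spec.get("configuration", {}), "config1": "config1_value", "config2": "config2_value"}

# 3. PUT the edited spec back (note: this replaces the pipeline settings)
requests.put(f"{host}/api/2.0/pipelines/{pipeline_id}", headers=headers, json=spec).raise_for_status()

# 4. Trigger the refresh
requests.post(f"{host}/api/2.0/pipelines/{pipeline_id}/updates", headers=headers, json={"full_refresh": True}).raise_for_status()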

Manjula_Ganesap
Contributor

@labromb - Please let me know if you found a solution to your problem. I'm trying to do the same thing. 

Manjula_Ganesap
Contributor

@Mo - I tried the same thing but I do not see the DLT pipeline config being overridden. The Web activity in ADF has the below config:

[screenshot: Manjula_Ganesap_0-1692886292298.png]

and the DLT pipeline config has 

[screenshot: Manjula_Ganesap_1-1692886330784.png]

Please let me know what I'm doing wrong. 
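
My current guess is that configuration keys sent in the body of the updates call are simply ignored, since that endpoint only seems to recognize its own fields, e.g.:

{
  "full_refresh": true
}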

 

Manjula_Ganesap
Contributor

@Mo  - Thank you. I will try this out now and keep you posted. I am directly passing the parameters in the body of the Web Activity with the POST command. I will try the update and let you know. 

Thank you so much for the suggestion. 

@Manjula_Ganesap  Can you let us know what you did to get this working? I have a similar use case.

Manjula_Ganesap
Contributor

@Mo - it worked. Thank you so much.
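
For anyone finding this later: @Mo's suggestion isn't quoted above, but from the replies the working pattern appears to be the two-step flow sketched earlier, i.e. edit the pipeline settings first, then start the update:

PUT  https://<workspace>/api/2.0/pipelines/<pipeline_id>           body: edited pipeline spec with the new configuration values
POST https://<workspace>/api/2.0/pipelines/<pipeline_id>/updates   body: {"full_refresh": true}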
