Re: Retrieve job-level parameters in Python

xiangzhu · 02-01-2025

In my opinion, task-level parameters are not very useful because they are hardcoded rather than real parameters. For those, I usually define all task-level values in a config.py file, directly in Python.

However, job-level parameters are really useful since we can change their values dynamically when manually triggering a job run.

I'm not aware of a built-in dbutils method that directly returns the current job parameters, but we can work around this by querying the Jobs API.

At the very beginning of the task's Python code:

1. Use dbutils to get the current job run ID.
2. Use the Jobs API to retrieve the current job run info for that run ID (either with a raw API call or with the legacy Python databricks_cli package, not the newer Go-based CLI).
3. Extract the job_parameters from the returned job info.
4. Save all the job_parameters as environment variables.

From anywhere in your code, you can now access the current job parameters from the environment variables without using argparse.
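As a rough illustration of the lookup side, here is a minimal sketch. The `JOB_PARAM_` prefix, the `get_job_param` helper, and the `target_date` parameter name are all hypothetical, chosen only for this example; they are not anything Databricks defines.

```python
import os

# Hypothetical prefix, assumed to match whatever was used when the
# job parameters were written to the environment.
JOB_PARAM_PREFIX = "JOB_PARAM_"


def get_job_param(name, default=None):
    """Return a job parameter previously stored as an environment variable."""
    return os.environ.get(f"{JOB_PARAM_PREFIX}{name.upper()}", default)


if __name__ == "__main__":
    # Simulate a parameter that the setup step would have exported.
    os.environ["JOB_PARAM_TARGET_DATE"] = "2025-02-01"
    print(get_job_param("target_date"))
    print(get_job_param("missing_param", default="fallback"))
```

Keep in mind that variables set through `os.environ` are visible only to the current Python process and to child processes it starts afterwards.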

For steps 1–4, you could write a function to encapsulate the entire process.
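Such a function might look roughly like the sketch below. It is written under several assumptions: the run info comes from the `GET /api/2.1/jobs/runs/get` endpoint, the response contains a `job_parameters` list whose entries have `name`, `value`, and `default` keys, and the workspace URL, token, and run ID are passed in by the caller. Getting the run ID from dbutils is left out on purpose, since I don't know a documented call for it. All function names and the `JOB_PARAM_` prefix are made up for this example.

```python
import json
import os
import urllib.parse
import urllib.request


def fetch_run_info(host, token, run_id):
    """Call the Jobs API runs/get endpoint and return the parsed JSON."""
    query = urllib.parse.urlencode({"run_id": run_id})
    request = urllib.request.Request(
        f"{host.rstrip('/')}/api/2.1/jobs/runs/get?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_job_parameters(run_info):
    """Build a {name: value} dict from the run info.

    Assumes each entry looks like {"name": ..., "value": ..., "default": ...}
    and falls back to the default when no value is set.
    """
    params = {}
    for entry in run_info.get("job_parameters", []):
        value = entry.get("value")
        if value is None:
            value = entry.get("default", "")
        params[entry["name"]] = value
    return params


def export_job_parameters(params, prefix="JOB_PARAM_"):
    """Store each parameter as an environment variable (hypothetical prefix)."""
    for name, value in params.items():
        os.environ[f"{prefix}{name.upper()}"] = str(value)


def load_job_parameters(host, token, run_id, prefix="JOB_PARAM_"):
    """Steps 2 to 4 in one call; step 1 (the run ID) is supplied by the caller."""
    params = extract_job_parameters(fetch_run_info(host, token, run_id))
    export_job_parameters(params, prefix)
    return params
```

Calling `load_job_parameters(host, token, run_id)` once at the top of the task would then be enough. Splitting out `extract_job_parameters` keeps the parsing testable without a workspace, for example against a saved API response.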