02-01-2025 06:16 AM
In my opinion, task-level parameters are not very useful: they are hardcoded, so they are not real parameters. In such cases, I usually define all task-level parameters directly in Python, in a config.py file.
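As an illustration, a config.py along these lines could hold the task-level values. This is only a sketch: `TaskConfig`, `CONFIG`, `source_table`, and `batch_size` are hypothetical names and placeholder values, not taken from any real job.

```python
# config.py (hypothetical example; all names and values are placeholders)
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskConfig:
    """Task-level settings kept in code instead of in the task definition."""
    source_table: str = "raw.events"
    batch_size: int = 500


# Single shared instance that task code can import: from config import CONFIG
CONFIG = TaskConfig()
```

Making the dataclass frozen keeps the values read-only at runtime, which matches the idea that these settings are fixed configuration rather than parameters.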
However, job-level parameters are really useful since we can change their values dynamically when manually triggering a job run.
I'm not aware of a built-in dbutils method that directly returns the current job parameters, but we can work around this by querying the Jobs API.
At the very beginning of the task's Python code:
1. Use dbutils to get the current job run ID.
2. Use the Jobs API to retrieve the current job run info from that run ID, either with a raw API call or with the databricks_cli SDK (the old Python library, not the new Go one).
3. Extract job_parameters from the returned info.
4. Save all the job_parameters as environment variables.
From anywhere in your code, you can now access the current job parameters from the environment variables without using argparse.
For steps 1–4, you could write a function to encapsulate the entire process.
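A minimal sketch of such a function, using a raw API call, might look like the following. Several things here are assumptions rather than anything confirmed above: that `GET /api/2.1/jobs/runs/get` returns `job_parameters` as a list of entries with `name`, `default`, and `value` keys; that `run_id` is the job run ID you already got from dbutils in step 1 (how to read it depends on the task type, so it is not shown); and that `host` and `token` are available to the task. The function names and the `JOB_PARAM_` prefix are hypothetical; the prefix is only there to avoid overwriting existing environment variables.

```python
import json
import os
import urllib.request


def fetch_run_info(host: str, token: str, run_id: int) -> dict:
    """Step 2: fetch the run info for run_id with a raw Jobs API call."""
    url = f"{host.rstrip('/')}/api/2.1/jobs/runs/get?run_id={run_id}"
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def extract_job_parameters(run_info: dict) -> dict:
    """Step 3: turn the job_parameters list into a {name: value} dict."""
    params = {}
    for entry in run_info.get("job_parameters", []):
        # Assumed entry shape: {"name": ..., "default": ..., "value": ...}.
        # Fall back to the default when no explicit value is present.
        value = entry.get("value", entry.get("default", ""))
        params[entry["name"]] = "" if value is None else str(value)
    return params


def export_job_parameters(run_info: dict, prefix: str = "JOB_PARAM_") -> dict:
    """Step 4: save every job parameter as an environment variable."""
    params = extract_job_parameters(run_info)
    for name, value in params.items():
        os.environ[f"{prefix}{name}"] = value
    return params
```

At the top of the task you would then call `export_job_parameters(fetch_run_info(host, token, run_id))`, and any other module could read a parameter named, say, `env` with `os.environ["JOB_PARAM_env"]`.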