Databricks Python wheel tasks: how to access the JobID & RunID?
04-28-2023 11:45 AM
I'm using Python (as a Python wheel application) on Databricks.
I deploy & run my jobs using dbx.
I defined some Databricks Workflows using Python wheel tasks.
Everything is working fine, but I'm having issues extracting "databricks_job_id" & "databricks_run_id" for logging/monitoring purposes.
I'm used to defining {{job_id}} & {{run_id}} as parameters in a "Notebook Task" or other task types, and that works fine.
But with a Python wheel task I'm not able to define these.
With a Python wheel task, parameters are basically an array of strings:
["/dbfs/Shared/dbx/projects/myproject/66655665aac24e748d4e7b28c6f4d624/artifacts/myparameter.yml","/dbfs/Shared/dbx/projects/myproject/66655665aac24e748d4e7b28c6f4d624/artifacts/conf"]
Adding "{{job_id}}" & "{{run_id}}" in this array doesn't seems to work ...
Do you have any ideas ? Don't want to use any REST API during my workload just to extract theses ids...
I guess that I cannot use dbutils / notebook context to got thoses IDs since I don't use any notebooks ...
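For context, the parameters of a Python wheel task are handed to the wheel's entry point as plain command-line arguments. Here is a minimal sketch of an entry point that just echoes whatever it receives (the function name is a placeholder):

```python
import sys

def main() -> None:
    # Python wheel task parameters arrive as ordinary command-line
    # arguments, in the same order as the task's "parameters" array.
    args = sys.argv[1:]
    print(f"Received parameters: {args}")

if __name__ == "__main__":
    main()
```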
05-13-2023 09:53 AM
@Grégoire PORTIER:
You can try reading the job ID and run ID from environment variables within your Python wheel application. Here's an example of how you can do this:

```python
import os

# Read the Databricks job ID and run ID from environment variables
job_id = os.environ.get("DATABRICKS_JOB_ID")
run_id = os.environ.get("DATABRICKS_RUN_ID")

# Print the job ID and run ID for logging/monitoring purposes
print(f"Databricks Job ID: {job_id}")
print(f"Databricks Run ID: {run_id}")
```
You can then add this code to your Python wheel task to extract the job ID and run ID and use them for logging/monitoring purposes.
Note that the environment variables DATABRICKS_JOB_ID and DATABRICKS_RUN_ID are automatically set by Databricks when you run a job, so you don't need to pass them as parameters.
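If those variables come back null (as reported below), a quick way to see which variables your cluster actually sets is to dump everything with a DATABRICKS_ prefix. This is just a debugging sketch; the set of exposed variables can vary across runtimes:

```python
import os

# Debugging aid: print every DATABRICKS_-prefixed environment variable
# that is actually set on the cluster running this task.
for key in sorted(os.environ):
    if key.startswith("DATABRICKS_"):
        print(f"{key}={os.environ[key]}")
```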
05-23-2023 06:32 AM
Hey Suteja,
Thank you for your response, but unfortunately it doesn't work with environment variables.
I get a null value for both variables.
Do you have any idea which DBR I should use?
Or any documentation about these environment variables?
Thank you
03-06-2024 08:54 AM
Here you can see a complete template project with Databricks Asset Bundles and a Python wheel task. Please follow the instructions for deployment.
https://github.com/andre-salvati/databricks-template
In particular, take a look at the workflow definition here.
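For reference, a minimal sketch of what such a workflow definition can look like in a bundle's databricks.yml; the job name, package name, and entry point below are placeholders, and it assumes dynamic value references ({{job.id}} / {{job.run_id}} on current runtimes; older docs use {{job_id}} / {{run_id}}):

```yaml
resources:
  jobs:
    my_wheel_job:
      name: my_wheel_job
      tasks:
        - task_key: main
          python_wheel_task:
            package_name: myproject   # placeholder package name
            entry_point: main         # placeholder entry point
            # Dynamic value references are substituted at run time and
            # reach the wheel as ordinary command-line arguments.
            parameters: ["{{job.id}}", "{{job.run_id}}"]
```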

