How to detect if running in a workflow job?
03-03-2024 08:54 AM
Hi there,
what's the best way to tell which environment my Spark session is running in? Locally I develop with databricks-connect's DatabricksSession, but that doesn't work when running a workflow job, which requires SparkSession.getOrCreate(). Right now I'm passing a job parameter that the app reads. Is there a more robust way to detect whether the app is running on a Databricks cluster or not?
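For context, the session setup currently looks roughly like this; the --running-on-databricks flag is just an illustrative name for the parameter the job passes in:

import argparse
from pyspark.sql import SparkSession

def get_spark(running_on_databricks: bool):
    """Pick the session type depending on where the code runs."""
    if running_on_databricks:
        # On a Databricks cluster the regular Spark session is available.
        return SparkSession.builder.getOrCreate()
    # Locally, go through databricks-connect instead.
    from databricks.connect import DatabricksSession
    return DatabricksSession.builder.getOrCreate()

parser = argparse.ArgumentParser()
parser.add_argument("--running-on-databricks", action="store_true")
args = parser.parse_args()
spark = get_spark(args.running_on_databricks)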
2 REPLIES
03-14-2024 11:21 AM
Thanks, dbutils.notebook.getContext does indeed contain information about the job run.
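For completeness, the check I ended up with looks roughly like this; the jobId tag name is what I observed in the context of my workspace, so treat it as an assumption:

import json

# Grab the notebook context and inspect its tags.
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
tags = json.loads(ctx.toJson()).get("tags", {})

# A jobId tag only appears when the notebook runs as part of a job.
is_job_run = "jobId" in tags
print(is_job_run)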
03-21-2025 08:08 PM
import json

def get_job_context():
    """Retrieve job-related context from the current Databricks notebook."""
    # Retrieve the notebook context
    ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()
    # Convert the context to a JSON string
    ctx_json = ctx.toJson()
    # Parse the JSON string into a Python dictionary
    ctx_dict = json.loads(ctx_json)
    # Access the 'tags' dictionary
    tags_dict = ctx_dict.get('tags', {})
    # Filter for keys containing 'job' or 'run'
    job_context = {k: v for k, v in tags_dict.items() if 'job' in k.lower() or 'run' in k.lower()}
    return job_context

def is_running_in_databricks_workflow():
    """Detect if running inside a Databricks Workflow job."""
    job_context = get_job_context()
    return 'jobName' in job_context

# Example usage
print(f"Is running in Databricks Workflow: {is_running_in_databricks_workflow()}")
Robert Altmiller

