<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to actually get job_id and run_id in a Databricks Python wheel task (Avoid Hallucinations) in Community Articles</title>
    <link>https://community.databricks.com/t5/community-articles/how-to-actually-get-job-id-and-run-id-in-a-databricks-python/m-p/152345#M1128</link>
    <description>&lt;P&gt;We needed job_id and run_id in a custom metrics Delta table so we could join it to `system.lakeflow.job_run_timeline`. We tried four approaches before finding one that works on serverless compute.&lt;/P&gt;&lt;H1&gt;What doesn't work&lt;/H1&gt;&lt;P&gt;&lt;EM&gt;spark.conf.get("spark.databricks.job.id")&lt;/EM&gt;&lt;BR /&gt;Throws CONFIG_NOT_AVAILABLE on serverless. The key exists on classic compute but is not exposed through Spark Connect.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;os.environ["DATABRICKS_JOB_ID"]&lt;/EM&gt;&lt;BR /&gt;Not a real env var. Databricks sets `DATABRICKS_RUNTIME_VERSION` and cluster lib paths, but nothing that carries job identity.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;dbutils.notebook.entry_point.getDbutils().notebook().getContext()&lt;/EM&gt;&lt;BR /&gt;Works on notebook tasks. Fails in Python wheel tasks with an error saying the module has no attribute 'notebook'.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;spark_env_vars with {{job.id}}&lt;/EM&gt;&lt;BR /&gt;Dynamic value references don't resolve in spark_env_vars. The value passes through as the literal string {{job.id}}.&lt;/P&gt;&lt;H1&gt;What works&lt;/H1&gt;&lt;P&gt;Job-level parameters with dynamic value references, passed into the task's named_parameters:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;parameters:
  - name: job_id
    default: "{{job.id}}"
  - name: run_id
    default: "{{job.run_id}}"

tasks:
  - python_wheel_task:
      named_parameters:
        job_id: "{{job.parameters.job_id}}"
        run_id: "{{job.parameters.run_id}}"&lt;/LI-CODE&gt;&lt;P&gt;Values arrive in &lt;EM&gt;sys.argv&lt;/EM&gt; as --key=value arguments. Parse with argparse:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import argparse
import sys

parser = argparse.ArgumentParser()
# named_parameters arrive on the command line as --key=value arguments
parser.add_argument("--job_id", type=int, default=None)
parser.add_argument("--run_id", type=int, default=None)
# parse_known_args tolerates arguments this parser does not define
args, _ = parser.parse_known_args(sys.argv[1:])&lt;/LI-CODE&gt;&lt;H1&gt;Bonus: dbruntime.databricks_repl_context also works&lt;/H1&gt;&lt;LI-CODE lang="python"&gt;from dbruntime.databricks_repl_context import get_context
# Undocumented internal module, so treat these attribute names as subject to change
ctx = get_context()
job_id = ctx.jobId
run_id = ctx.idInJob&lt;/LI-CODE&gt;&lt;P&gt;This module is undocumented, but it worked in both script and wheel tasks on serverless. We went with `named_parameters` because it's the documented approach.&lt;/P&gt;&lt;H1&gt;How I figured this out&lt;/H1&gt;&lt;P&gt;I wrote a 30-line test script that dumps sys.argv, all env vars, the Spark conf, and the dbutils context, then created a Databricks job with job parameters set to {{job.id}} and {{job.run_id}} and ran it once. The output showed exactly which sources had real values and which were empty.&lt;/P&gt;&lt;P&gt;Sometimes the fastest path to the answer is the oldest trick: print everything, read the output.&lt;/P&gt;&lt;P&gt;Full blog post with the story behind these findings: &lt;A href="https://medium.com/@kirankbs/four-hallucinations-and-a-python-script-6cc3da4f57b6" target="_self"&gt;link&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 27 Mar 2026 19:34:54 GMT</pubDate>
    <dc:creator>Kirankumarbs</dc:creator>
    <dc:date>2026-03-27T19:34:54Z</dc:date>
    <item>
      <title>How to actually get job_id and run_id in a Databricks Python wheel task (Avoid Hallucinations)</title>
      <link>https://community.databricks.com/t5/community-articles/how-to-actually-get-job-id-and-run-id-in-a-databricks-python/m-p/152345#M1128</link>
      <pubDate>Fri, 27 Mar 2026 19:34:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/how-to-actually-get-job-id-and-run-id-in-a-databricks-python/m-p/152345#M1128</guid>
      <dc:creator>Kirankumarbs</dc:creator>
      <dc:date>2026-03-27T19:34:54Z</dc:date>
    </item>
  </channel>
</rss>

