<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Access task level parameters along with parameters passed by airflow job in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129805#M48607</link>
    <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/180515"&gt;@divyab7&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P class=""&gt;Sorry, now I understand better what you actually need. I got confused at first and thought you only wanted to access the parameters you pass through Airflow.&lt;/P&gt;&lt;P class=""&gt;I think the dynamic identifiers that Databricks generates at runtime (like run IDs) are not injected into the script&amp;rsquo;s arguments automatically.&lt;/P&gt;&lt;P class=""&gt;One way I have been thinking of to get them without using &lt;SPAN class=""&gt;dbutils&lt;/SPAN&gt;:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;&lt;STRONG&gt;Job ID&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN class=""&gt;&amp;nbsp;→&amp;nbsp;you can extract it from &lt;/SPAN&gt;spark.conf.get("spark.databricks.clusterUsageTags.clusterName")&lt;SPAN class=""&gt;, which has a value like &lt;/SPAN&gt;job-&amp;lt;job_id&amp;gt;-run-&amp;lt;task_run_id&amp;gt;&lt;SPAN class=""&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;&lt;STRONG&gt;Job run ID&lt;/STRONG&gt;&lt;/SPAN&gt; → once you have the job_id, you can call the &lt;A href="https://docs.databricks.com/api/workspace/jobs/listruns" target="_self"&gt;Databricks Jobs API&amp;nbsp;&lt;/A&gt;and retrieve the&amp;nbsp;&lt;SPAN class=""&gt;job_run_id&lt;/SPAN&gt;.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;This approach should work, but I agree it’s not very straightforward. Databricks could definitely make it easier to expose these values directly in the runtime context instead of having to parse them or query the API.&lt;BR /&gt;&lt;BR /&gt;Hope this helps, &lt;span class="lia-unicode-emoji" title=":sad_but_relieved_face:"&gt;😥&lt;/span&gt;&lt;BR /&gt;Isi&lt;/P&gt;</description>
    <pubDate>Tue, 26 Aug 2025 10:47:28 GMT</pubDate>
    <dc:creator>Isi</dc:creator>
    <dc:date>2025-08-26T10:47:28Z</dc:date>
    <item>
      <title>Access task level parameters along with parameters passed by airflow job</title>
      <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129157#M48453</link>
      <description>&lt;P&gt;I have an Airflow DAG that calls a Databricks job with a task-level parameter defined as job_run_id (job.run_id) and a task type of python_script. When I try to access it using sys.argv with spark_python_task, it only prints the JSON that was passed through the Airflow job. I want sys.argv to receive both the parameters passed by the DAG and those defined on the Databricks job.&amp;nbsp;&lt;/P&gt;&lt;P&gt;We have a use case where we don't want to use anything related to dbutils. It's a Python script, so we want it to be independent of dbutils.&lt;/P&gt;</description>
      <pubDate>Thu, 21 Aug 2025 16:51:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129157#M48453</guid>
      <dc:creator>divyab7</dc:creator>
      <dc:date>2025-08-21T16:51:28Z</dc:date>
    </item>
    <item>
      <title>Re: Access task level parameters along with parameters passed by airflow job</title>
      <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129526#M48551</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/180515"&gt;@divyab7&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P class=""&gt;Hi! I ran into the same thing. The short version is: for &lt;SPAN class=""&gt;spark_python_task&lt;/SPAN&gt;, the script only receives the arguments you send in the run payload, and Databricks does &lt;SPAN class=""&gt;&lt;STRONG&gt;not&lt;/STRONG&gt;&lt;/SPAN&gt; automatically merge “job-level” parameters with the ones you pass at run time. What worked for me was to build the job &lt;SPAN class=""&gt;&lt;STRONG&gt;dynamically&lt;/STRONG&gt;&lt;/SPAN&gt; from Airflow: I keep a small YAML (or dict) with the job defaults (cluster type, wheels, and also any default CLI args I want), and then, when the DAG runs, I &lt;SPAN class=""&gt;&lt;STRONG&gt;merge&lt;/STRONG&gt;&lt;/SPAN&gt; those defaults with the DAG’s dynamic values (like &lt;SPAN class=""&gt;data_interval_start&lt;/SPAN&gt; / &lt;SPAN class=""&gt;data_interval_end&lt;/SPAN&gt;). The result is a single, flat list of CLI parameters that I send in the &lt;SPAN class=""&gt;parameters&lt;/SPAN&gt; field of the run request.&lt;/P&gt;&lt;P class=""&gt;This way, inside the Python script I don’t rely on &lt;SPAN class=""&gt;dbutils&lt;/SPAN&gt; at all — I just parse the CLI args and everything is there (both the job defaults and the DAG-specific values). The key point is that run-time parameters &lt;SPAN class=""&gt;&lt;STRONG&gt;replace&lt;/STRONG&gt;&lt;/SPAN&gt; the job’s parameters unless you merge them yourself before submitting the run. 
This approach keeps the job configurable (cluster/image/wheels can change via config), and at the same time injects all execution info into the script in a simple, dependency-free way.&lt;BR /&gt;&lt;BR /&gt;Tell me if you need more details, &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;BR /&gt;&lt;BR /&gt;Isi&lt;/P&gt;</description>
      <pubDate>Sun, 24 Aug 2025 19:35:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129526#M48551</guid>
      <dc:creator>Isi</dc:creator>
      <dc:date>2025-08-24T19:35:38Z</dc:date>
    </item>
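The merge approach described in the reply above can be sketched in a few lines of Python. This is an illustrative sketch, not the poster's actual code: the default keys (`env`, `output_table`) and the dynamic Airflow value are assumed names, and in a real DAG the merged list would be sent as the `parameters` field of the Databricks run payload.

```python
# Sketch of the merge approach: combine static job defaults (e.g. loaded from
# a YAML file) with Airflow's run-time values into one flat CLI parameter
# list. Run-time parameters REPLACE job parameters for spark_python_task, so
# the merge has to happen before submitting the run.

def merge_parameters(defaults: dict, dynamic: dict) -> list[str]:
    """Merge defaults with DAG-provided values; dynamic values win on conflict."""
    merged = {**defaults, **dynamic}
    # Flatten into a "--key value" list, the shape a python_script task's
    # sys.argv will receive.
    params: list[str] = []
    for key, value in merged.items():
        params.extend([f"--{key}", str(value)])
    return params

# Static defaults you might keep alongside cluster/wheel config (assumed names).
defaults = {"env": "prod", "output_table": "analytics.daily"}
# Values only known when the DAG actually runs (from Airflow's template context).
dynamic = {"data_interval_start": "2025-08-24T00:00:00Z"}

print(merge_parameters(defaults, dynamic))
# → ['--env', 'prod', '--output_table', 'analytics.daily',
#    '--data_interval_start', '2025-08-24T00:00:00Z']
```

Inside the Databricks script itself, these arguments can then be read with plain `argparse`, with no dependency on `dbutils`.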
    <item>
      <title>Re: Access task level parameters along with parameters passed by airflow job</title>
      <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129529#M48552</link>
      <description>&lt;P&gt;Thank you for your response. Can you please give me an example of how to implement this? Should it be implemented in a certain way, or do you have a code example?&lt;/P&gt;</description>
      <pubDate>Sun, 24 Aug 2025 20:51:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129529#M48552</guid>
      <dc:creator>divyab7</dc:creator>
      <dc:date>2025-08-24T20:51:38Z</dc:date>
    </item>
    <item>
      <title>Re: Access task level parameters along with parameters passed by airflow job</title>
      <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129598#M48564</link>
      <description>&lt;P&gt;My use case is that we need job.run_id, which we only get when the job is triggered, and the Python script invoked by the Databricks job needs it in order to move forward. I am still confused: even if we merge the parameters, how is that going to replace the dynamic value reference in Databricks? Can you please provide a small code example?&lt;/P&gt;</description>
      <pubDate>Mon, 25 Aug 2025 11:09:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129598#M48564</guid>
      <dc:creator>divyab7</dc:creator>
      <dc:date>2025-08-25T11:09:54Z</dc:date>
    </item>
    <item>
      <title>Re: Access task level parameters along with parameters passed by airflow job</title>
      <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129805#M48607</link>
      <description>&lt;P&gt;Hey&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/180515"&gt;@divyab7&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P class=""&gt;Sorry, now I understand better what you actually need. I got confused at first and thought you only wanted to access the parameters you pass through Airflow.&lt;/P&gt;&lt;P class=""&gt;I think the dynamic identifiers that Databricks generates at runtime (like run IDs) are not injected into the script&amp;rsquo;s arguments automatically.&lt;/P&gt;&lt;P class=""&gt;One way I have been thinking of to get them without using &lt;SPAN class=""&gt;dbutils&lt;/SPAN&gt;:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;&lt;STRONG&gt;Job ID&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;SPAN class=""&gt;&amp;nbsp;→&amp;nbsp;you can extract it from &lt;/SPAN&gt;spark.conf.get("spark.databricks.clusterUsageTags.clusterName")&lt;SPAN class=""&gt;, which has a value like &lt;/SPAN&gt;job-&amp;lt;job_id&amp;gt;-run-&amp;lt;task_run_id&amp;gt;&lt;SPAN class=""&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;/LI&gt;&lt;LI&gt;&lt;P class=""&gt;&lt;SPAN class=""&gt;&lt;STRONG&gt;Job run ID&lt;/STRONG&gt;&lt;/SPAN&gt; → once you have the job_id, you can call the &lt;A href="https://docs.databricks.com/api/workspace/jobs/listruns" target="_self"&gt;Databricks Jobs API&amp;nbsp;&lt;/A&gt;and retrieve the&amp;nbsp;&lt;SPAN class=""&gt;job_run_id&lt;/SPAN&gt;.&lt;/P&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&amp;nbsp;&lt;/P&gt;&lt;P class=""&gt;This approach should work, but I agree it’s not very straightforward. Databricks could definitely make it easier to expose these values directly in the runtime context instead of having to parse them or query the API.&lt;BR /&gt;&lt;BR /&gt;Hope this helps, &lt;span class="lia-unicode-emoji" title=":sad_but_relieved_face:"&gt;😥&lt;/span&gt;&lt;BR /&gt;Isi&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 10:47:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129805#M48607</guid>
      <dc:creator>Isi</dc:creator>
      <dc:date>2025-08-26T10:47:28Z</dc:date>
    </item>
    <item>
      <title>Re: Access task level parameters along with parameters passed by airflow job</title>
      <link>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129823#M48613</link>
      <description>&lt;P&gt;This was really helpful. Thank you for the response &lt;span class="lia-unicode-emoji" title=":smiling_face_with_smiling_eyes:"&gt;😊&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 14:15:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-task-level-parameters-along-with-parameters-passed-by/m-p/129823#M48613</guid>
      <dc:creator>divyab7</dc:creator>
      <dc:date>2025-08-26T14:15:58Z</dc:date>
    </item>
  </channel>
</rss>

