<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: use job parameters in scripts in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82136#M36533</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/91166"&gt;@N_M&lt;/a&gt;, I have the same issue.&lt;/P&gt;&lt;P&gt;Have you found a solution to the problem?&lt;/P&gt;</description>
    <pubDate>Wed, 07 Aug 2024 08:18:16 GMT</pubDate>
    <dc:creator>jensi</dc:creator>
    <dc:date>2024-08-07T08:18:16Z</dc:date>
    <item>
      <title>use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/75296#M34918</link>
      <description>&lt;P&gt;Hi Community&lt;/P&gt;&lt;P&gt;I did some research but wasn't lucky, and I'm a bit surprised I can't find anything about this.&lt;/P&gt;&lt;P&gt;I would simply like to access the job parameters from Python scripts (&lt;U&gt;not notebooks&lt;/U&gt;).&lt;/P&gt;&lt;P&gt;My flow doesn't use notebooks, but I still need to drive some parameters that I want to declare before I run the job (so, static).&lt;/P&gt;&lt;P&gt;Here are my attempts so far:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The trivial use of widgets does not work.&amp;nbsp;Widgets are available only in notebooks, so&amp;nbsp;&lt;EM&gt;dbutils.widgets.text/get&lt;/EM&gt; are out of scope; the functions simply return None.&lt;/LI&gt;&lt;LI&gt;I tried environment variables set at runtime: a simple notebook, set as the root of the flow, pushes the job parameter into an environment variable, i.e., &lt;EM&gt;os.environ["PARAM"] =&amp;nbsp;dbutils.widgets.get("PARAM")&lt;/EM&gt;.&lt;BR /&gt;Unfortunately, environment variables are *not* propagated to child tasks (probably because the interpreter is restarted for each task).&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Of course there are 10K workarounds, but some are not applicable to my scope and some are really bad practice. Here are my limitations:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Setting environment variables in init scripts does NOT solve my problem. I want a parameter I can change before each run of the job, so: same cluster, same flow, etc.&lt;/LI&gt;&lt;LI&gt;I want to avoid creating one job cluster per parameter value and "picking" the relevant cluster every time I change parameters. Not a good practice in my view.&lt;/LI&gt;&lt;LI&gt;I have 20+ scripts running as a DAG flow. They are scripts because they are also supposed to run outside Databricks and independently, so I want to avoid converting them to notebooks (that also brings some issues with versioning the code AND the VS Code Databricks plugin... other topic).&lt;/LI&gt;&lt;LI&gt;I cannot use task values. Task values depend on a specific task, and it doesn't make sense to load a parameter in all my scripts with a hardcoded task key (something like&amp;nbsp;&lt;EM&gt;dbutils.jobs.taskValues.get(taskKey = "environment_setter", key = "param", default = 42, debugValue = 0)&lt;/EM&gt;, where "environment_setter" is my root task...).&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Any ideas are really appreciated!&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jun 2024 09:04:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/75296#M34918</guid>
      <dc:creator>N_M</dc:creator>
      <dc:date>2024-06-21T09:04:22Z</dc:date>
    </item>
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/75341#M34936</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/91166"&gt;@N_M&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;From what I see in the documentation, spark_python_task takes "parameters" as an array of strings, in which you can put your command-line parameters using {{job.parameters.[name]}}&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jun 2024 11:54:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/75341#M34936</guid>
      <dc:creator>daniel_sahal</dc:creator>
      <dc:date>2024-06-21T11:54:09Z</dc:date>
    </item>
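A sketch of the job definition this suggestion describes, as a fragment of a Jobs API JSON payload (the job name, script path, and parameter name are placeholders; verify the exact field names against the Jobs API 2.1 reference for your workspace):

```json
{
  "name": "my_job",
  "parameters": [
    { "name": "my_param", "default": "dev" }
  ],
  "tasks": [
    {
      "task_key": "run_script",
      "spark_python_task": {
        "python_file": "/Workspace/Scripts/my_script.py",
        "parameters": ["--my_param", "{{job.parameters.my_param}}"]
      }
    }
  ]
}
```

At run time the `{{job.parameters.my_param}}` reference is substituted with the job-level value, so the script receives it as an ordinary command-line argument.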
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/75517#M34974</link>
      <description>&lt;P&gt;A workaround I found is to use the Databricks Jobs API to get the job run info. The job parameters are inside, but you need to prepare credentials in advance.&lt;/P&gt;</description>
      <pubDate>Sun, 23 Jun 2024 18:21:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/75517#M34974</guid>
      <dc:creator>xiangzhu</dc:creator>
      <dc:date>2024-06-23T18:21:58Z</dc:date>
    </item>
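A minimal sketch of this workaround, assuming a personal access token is available and that the `runs/get` response carries a `job_parameters` array of `{name, default, value}` objects (the field layout is an assumption based on the Jobs API 2.1 docs; the host, token, and run_id are placeholders):

```python
import json
import urllib.request


def extract_job_parameters(run_info: dict) -> dict:
    """Turn the `job_parameters` array of a runs/get response into a name->value dict.

    Falls back to each parameter's `default` when no run-time `value` was set.
    """
    return {
        p["name"]: p.get("value", p.get("default"))
        for p in run_info.get("job_parameters", [])
    }


def get_run_info(host: str, token: str, run_id: int) -> dict:
    """Fetch run info from the Jobs API (requires credentials prepared in advance)."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/get?run_id={run_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

The drawback mentioned later in the thread applies here too: the script needs a credential at run time, whereas the parameter-pushdown approach does not.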
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/81153#M36241</link>
      <description>&lt;P&gt;This is the right answer, here is the doc Daniel is referring to:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/workflows/jobs/parameter-value-references.html#pass-context-about-job-runs-into-job-tasks" target="_blank"&gt;https://docs.databricks.com/en/workflows/jobs/parameter-value-references.html#pass-context-about-job-runs-into-job-tasks&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 30 Jul 2024 14:28:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/81153#M36241</guid>
      <dc:creator>Antoine_B</dc:creator>
      <dc:date>2024-07-30T14:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82136#M36533</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/91166"&gt;@N_M&lt;/a&gt;, I have the same issue.&lt;/P&gt;&lt;P&gt;Have you found a solution to the problem?&lt;/P&gt;</description>
      <pubDate>Wed, 07 Aug 2024 08:18:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82136#M36533</guid>
      <dc:creator>jensi</dc:creator>
      <dc:date>2024-08-07T08:18:16Z</dc:date>
    </item>
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82140#M36534</link>
      <description>&lt;P&gt;There are 2 solutions:&lt;/P&gt;&lt;P&gt;the official one:&amp;nbsp;&lt;A href="https://community.databricks.com/t5/data-engineering/retrieve-job-level-parameters-in-python/m-p/82091#M36512" target="_blank"&gt;Re: Retrieve job-level parameters in Python - Databricks Community - 44720&lt;/A&gt;&lt;/P&gt;&lt;P&gt;the other one is to use the Jobs REST API.&lt;/P&gt;&lt;P&gt;The official one is only available once you reach the argparse part (for some use cases, that might be too late). On the other hand, the REST API is reachable from anywhere.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Aug 2024 08:35:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82140#M36534</guid>
      <dc:creator>xiangzhu</dc:creator>
      <dc:date>2024-08-07T08:35:59Z</dc:date>
    </item>
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82254#M36581</link>
      <description>&lt;P&gt;Thank you! It worked! Used the first (official) solution with asset bundles.&lt;/P&gt;</description>
      <pubDate>Wed, 07 Aug 2024 14:56:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82254#M36581</guid>
      <dc:creator>jensi</dc:creator>
      <dc:date>2024-08-07T14:56:50Z</dc:date>
    </item>
    <item>
      <title>Re: use job parameters in scripts</title>
      <link>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82261#M36585</link>
      <description>&lt;P&gt;The only working workaround I found was provided in another thread:&lt;BR /&gt;&lt;A href="https://community.databricks.com/t5/data-engineering/retrieve-job-level-parameters-in-python/m-p/78610/highlight/true#M35568" target="_blank"&gt;Re: Retrieve job-level parameters in Python - Databricks Community - 44720&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I will repost it here (thanks&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/111996"&gt;@julio_resende&lt;/a&gt;&amp;nbsp;)&lt;/P&gt;&lt;P&gt;You need to push your parameters down to the task level. E.g.:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Create a job-level parameter called "my_param"&lt;/LI&gt;&lt;LI&gt;Make a reference to this job parameter in the task-level parameters box. E.g.:&lt;BR /&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;["--my_param","{{job.parameters.my_param}}"]&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/LI&gt;&lt;LI&gt;Read the task-level parameter using argparse in your .py file&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Wed, 07 Aug 2024 15:57:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/use-job-parameters-in-scripts/m-p/82261#M36585</guid>
      <dc:creator>N_M</dc:creator>
      <dc:date>2024-08-07T15:57:41Z</dc:date>
    </item>
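Step 3 of the workaround above can be sketched like this ("my_param" is the example name from the thread; the fallback default is a placeholder):

```python
import argparse


def read_params(argv=None):
    """Parse the task-level parameters that the job pushed down as
    ["--my_param", "{{job.parameters.my_param}}"]. When argv is None,
    argparse reads sys.argv, which Databricks populates at run time."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--my_param", default="fallback")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = read_params()
    print(args.my_param)
```

Because the script only sees ordinary command-line arguments, it also runs unchanged outside Databricks, which matches the original requirement in this thread.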
  </channel>
</rss>

