<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: get job run link based on the job name or the submit body in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/100124#M40195</link>
    <description>&lt;P&gt;Even the REST API only returns job details for a given job ID, which I would first need to look up from the job_name that I have. That lookup seems to be the only workable solution, since job_id is the true identifier of a workflow job and several jobs can share the same name.&lt;/P&gt;</description>
    <pubDate>Tue, 26 Nov 2024 15:47:44 GMT</pubDate>
    <dc:creator>ctiwari7</dc:creator>
    <dc:date>2024-11-26T15:47:44Z</dc:date>
    <item>
      <title>get job run link based on the job name or the submit body</title>
      <link>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/89277#M37752</link>
      <description>&lt;P&gt;This is the code I am currently using (ignore the indentation). It fetches the list of runs and then filters that list to find the run whose name and task match the submit body. Is there a better way to do this?&lt;/P&gt;&lt;P&gt;I am using the legacy Databricks CLI, version 0.17.8.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;import json&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;import subprocess&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;# List runs as JSON through the legacy CLI.&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;cmd = ["databricks", "runs", "list", "--output", "json"]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;output = subprocess.run(cmd, capture_output=True) # noqa: S607,S603&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;stdout = output.stdout.decode("utf-8")&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;runs = json.loads(stdout)&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;run_name = submit_body["run_name"]&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;spark_python_task = submit_body["spark_python_task"]&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;# Take the first run whose name and Python task both match the submit body.&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;matching_run = None&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;for _run in runs.get("runs", []):&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; if _run.get("run_name") == run_name and _run.get("task", {}).get("spark_python_task") == spark_python_task:&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; matching_run = _run&lt;/EM&gt;&lt;BR /&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; break&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Sep 2024 11:28:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/89277#M37752</guid>
      <dc:creator>ctiwari7</dc:creator>
      <dc:date>2024-09-10T11:28:32Z</dc:date>
    </item>
    <item>
      <title>Re: get job run link based on the job name or the submit body</title>
      <link>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/89284#M37754</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/120100"&gt;@ctiwari7&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Whether this is a better approach is subjective, but you can try two alternatives:&lt;/P&gt;&lt;P&gt;1.&amp;nbsp;System tables:&amp;nbsp;&lt;A href="https://docs.databricks.com/en/admin/system-tables/jobs.html#job-run-timeline-table-schema" target="_blank" rel="noopener"&gt;Jobs system table reference | Databricks on AWS&lt;/A&gt;&lt;/P&gt;&lt;P&gt;2.&amp;nbsp;REST API calls:&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;- First, get a list of all job names and their respective IDs using the list jobs REST API endpoint:&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;A href="https://docs.databricks.com/api/azure/workspace/jobs/list" target="_blank" rel="noopener"&gt;List jobs | Jobs API | REST API reference | Azure Databricks&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;- Then use the job runs endpoint to get the active job runs with all required information. You can associate each job run with its job name through the job_id attribute:&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;&lt;A href="https://docs.databricks.com/api/azure/workspace/jobs/listruns" target="_blank" rel="noopener"&gt;List job runs | Jobs API | REST API reference | Azure Databricks&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Sep 2024 12:03:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/89284#M37754</guid>
      <dc:creator>szymon_dybczak</dc:creator>
      <dc:date>2024-09-10T12:03:29Z</dc:date>
    </item>
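The two REST steps described in the reply above could be combined roughly as follows. This is only a sketch under stated assumptions: it works on JSON payloads that were already fetched from the list jobs and list job runs endpoints, it assumes those payloads carry the fields job_id, settings.name, run_id and run_page_url, and it does not handle pagination. The function names and the sample payloads are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def job_ids_by_name(jobs_payload):
    """Map each job name to every job_id that uses it (names are not unique)."""
    ids = defaultdict(list)
    for job in jobs_payload.get("jobs", []):
        name = job.get("settings", {}).get("name")
        if name is not None:
            ids[name].append(job["job_id"])
    return dict(ids)


def run_links_for_job_name(job_name, jobs_payload, runs_payload):
    """Return (run_id, run_page_url) pairs for runs whose job has the given name."""
    wanted = set(job_ids_by_name(jobs_payload).get(job_name, []))
    return [
        (run["run_id"], run.get("run_page_url"))
        for run in runs_payload.get("runs", [])
        if run.get("job_id") in wanted
    ]


# Hypothetical sample payloads shaped like the assumed API responses.
jobs_payload = {
    "jobs": [
        {"job_id": 1, "settings": {"name": "nightly_etl"}},
        {"job_id": 2, "settings": {"name": "nightly_etl"}},
        {"job_id": 3, "settings": {"name": "reporting"}},
    ]
}
runs_payload = {
    "runs": [
        {"run_id": 101, "job_id": 1, "run_page_url": "https://example.invalid/run/101"},
        {"run_id": 102, "job_id": 3, "run_page_url": "https://example.invalid/run/102"},
        {"run_id": 103, "job_id": 2, "run_page_url": "https://example.invalid/run/103"},
    ]
}

print(run_links_for_job_name("nightly_etl", jobs_payload, runs_payload))
```

Because a name can map to several job IDs, the lookup returns every matching run instead of stopping at the first one; the caller still has to decide which run is the intended one.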
    <item>
      <title>Re: get job run link based on the job name or the submit body</title>
      <link>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/100124#M40195</link>
      <description>&lt;P&gt;Even the REST API only returns job details for a given job ID, which I would first need to look up from the job_name that I have. That lookup seems to be the only workable solution, since job_id is the true identifier of a workflow job and several jobs can share the same name.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Nov 2024 15:47:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/get-job-run-link-based-on-the-job-name-or-the-submit-body/m-p/100124#M40195</guid>
      <dc:creator>ctiwari7</dc:creator>
      <dc:date>2024-11-26T15:47:44Z</dc:date>
    </item>
  </channel>
</rss>

