<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to ensure that a Databricks Run Submit run invoked from Airflow only runs one time? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-ensure-that-a-databricks-run-submit-run-invoked-from/m-p/25520#M17760</link>
    <description>&lt;P&gt;You can make the submission idempotent by supplying an idempotency token. The token is passed in the REST API request, as described in this doc:&lt;/P&gt;&lt;P&gt;&lt;A href="https://kb.databricks.com/jobs/jobs-idempotency.html" target="test_blank"&gt;https://kb.databricks.com/jobs/jobs-idempotency.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The primary cause of duplicate runs is a client-side retry. The client submits the request and waits for a response from the server (the Job Service). If no response arrives within the client's timeout period, the client retries. If the initial request did reach the Job Service, it triggers a run, and the retry triggers a second run, so the same job runs twice. When both requests carry the same idempotency token, the retry does not trigger a duplicate run.&lt;/P&gt;</description>
    <pubDate>Wed, 23 Jun 2021 23:17:27 GMT</pubDate>
    <dc:creator>brickster_2018</dc:creator>
    <dc:date>2021-06-23T23:17:27Z</dc:date>
    <item>
      <title>How to ensure that a Databricks Run Submit run invoked from Airflow only runs one time?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-ensure-that-a-databricks-run-submit-run-invoked-from/m-p/25519#M17759</link>
      <description>&lt;P&gt;I am running jobs on Databricks using the Run Submit API with Airflow. I have noticed that, on rare occasions, a particular run is executed more than once at the same time. Why?&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jun 2021 22:35:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-ensure-that-a-databricks-run-submit-run-invoked-from/m-p/25519#M17759</guid>
      <dc:creator>cgrant</dc:creator>
      <dc:date>2021-06-08T22:35:20Z</dc:date>
    </item>
    <item>
      <title>Re: How to ensure that a Databricks Run Submit run invoked from Airflow only runs one time?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-ensure-that-a-databricks-run-submit-run-invoked-from/m-p/25520#M17760</link>
      <description>&lt;P&gt;You can make the submission idempotent by supplying an idempotency token. The token is passed in the REST API request, as described in this doc:&lt;/P&gt;&lt;P&gt;&lt;A href="https://kb.databricks.com/jobs/jobs-idempotency.html" target="test_blank"&gt;https://kb.databricks.com/jobs/jobs-idempotency.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The primary cause of duplicate runs is a client-side retry. The client submits the request and waits for a response from the server (the Job Service). If no response arrives within the client's timeout period, the client retries. If the initial request did reach the Job Service, it triggers a run, and the retry triggers a second run, so the same job runs twice. When both requests carry the same idempotency token, the retry does not trigger a duplicate run.&lt;/P&gt;</description>
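      <content:encoded><![CDATA[
To illustrate the retry scenario described in this reply, here is a minimal sketch, not an official client. It assumes the token can be derived from the Airflow DAG id, task id and logical date so that every retry of the same task instance sends the same value. The helper names (make_idempotency_token, build_submit_payload) and the FakeJobService class are hypothetical and exist only for this example; the one real API detail used is the idempotency_token field of the Run Submit request body.

```python
import hashlib


def make_idempotency_token(dag_id: str, task_id: str, logical_date: str) -> str:
    """Derive a stable token so every retry of one task instance reuses it.

    Assumption: dag_id + task_id + logical_date uniquely identify the run.
    """
    raw = f"{dag_id}:{task_id}:{logical_date}"
    # A SHA-256 hex digest is exactly 64 characters long.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def build_submit_payload(run_name: str, spec: dict, token: str) -> dict:
    """Build a Run Submit request body that carries the idempotency token."""
    payload = dict(spec)  # copy so the caller's spec is left untouched
    payload["run_name"] = run_name
    payload["idempotency_token"] = token
    return payload


class FakeJobService:
    """Hypothetical in-memory stand-in for the Job Service (illustration only)."""

    def __init__(self) -> None:
        self._runs_by_token = {}
        self._next_run_id = 1

    def submit(self, payload: dict) -> int:
        token = payload.get("idempotency_token")
        # A repeated token returns the existing run instead of starting a new one.
        if token is not None and token in self._runs_by_token:
            return self._runs_by_token[token]
        run_id = self._next_run_id
        self._next_run_id += 1
        if token is not None:
            self._runs_by_token[token] = run_id
        return run_id


if __name__ == "__main__":
    service = FakeJobService()
    token = make_idempotency_token("example_dag", "submit_run", "2021-06-08T00:00:00")
    payload = build_submit_payload("example-run", {"existing_cluster_id": "placeholder"}, token)
    first = service.submit(payload)
    retry = service.submit(payload)  # simulates the client retrying after a timeout
    print(first == retry)
```

The token is derived rather than random because a fresh random value on each attempt would defeat the purpose: the retry would look like a new request.
]]></content:encoded>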
      <pubDate>Wed, 23 Jun 2021 23:17:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-ensure-that-a-databricks-run-submit-run-invoked-from/m-p/25520#M17760</guid>
      <dc:creator>brickster_2018</dc:creator>
      <dc:date>2021-06-23T23:17:27Z</dc:date>
    </item>
  </channel>
</rss>

