<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Configure jobs throttling for ephemeral cluster ETLs in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32432#M23632</link>
    <description>&lt;P&gt;If you have a fixed-size cluster, this will happen automatically. Just don't turn on autoscaling.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling" target="test_blank"&gt;https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 31 Dec 2021 00:18:02 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2021-12-31T00:18:02Z</dc:date>
    <item>
      <title>Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32430#M23630</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Is it possible to configure job throttling so that jobs across a workspace are queued once a given number of concurrent executions is reached when using the ephemeral cluster pattern? The reason is mainly cost control. We would rather accept reduced performance than higher cost when too many jobs run at once, for whatever reason.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Thu, 30 Dec 2021 14:28:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32430#M23630</guid>
      <dc:creator>RicksDB</dc:creator>
      <dc:date>2021-12-30T14:28:44Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32431#M23631</link>
      <description>&lt;P&gt;Hello @E H​&amp;nbsp;- Welcome and thank you for asking. My name is Piper, and I'm a moderator for Databricks. Let's see what the other members have to say. If we don't hear anything, we'll swing back to this.&lt;/P&gt;</description>
      <pubDate>Thu, 30 Dec 2021 17:28:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32431#M23631</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-30T17:28:11Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32432#M23632</link>
      <description>&lt;P&gt;If you have a fixed-size cluster, this will happen automatically. Just don't turn on autoscaling.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling" target="test_blank"&gt;https://docs.databricks.com/clusters/configure.html#cluster-size-and-autoscaling&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 31 Dec 2021 00:18:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32432#M23632</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-31T00:18:02Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32433#M23633</link>
      <description>&lt;P&gt;Thanks for the answer, josephk.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, this solution doesn't work in my case.&lt;/P&gt;&lt;P&gt;If I launch 20 different jobs, I will have 20 ephemeral clusters running at the same time. Hence, if they each run for 5 minutes, we are billed for 100 minutes of execution (20 clusters × 5 minutes).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The idea would be, for example, to configure a maximum total execution time per 30-minute window, with any further jobs queued once that limit is reached. I could then multiply the configured limit by 48 (the number of 30-minute windows in a day) to get the worst possible case for a single day.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I also tried a scenario using a pool with a maximum number of VMs. However, instead of being queued, the additional jobs failed because new VMs couldn't be provisioned.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;The objective is to be able to predict cost by calculating the worst case for a single day and to ensure that we don't go beyond it. Right now, the only way I can do this is by using an interactive cluster (and paying more DBUs) instead of a job cluster.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Dec 2021 00:34:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32433#M23633</guid>
      <dc:creator>RicksDB</dc:creator>
      <dc:date>2021-12-31T00:34:51Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32434#M23634</link>
      <description>&lt;P&gt;Why not just run all the jobs on the same cluster? That would save you a lot of time, since you would not be starting up 20 clusters.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Dec 2021 00:38:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32434#M23634</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-31T00:38:50Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32435#M23635</link>
      <description>&lt;P&gt;As far as I know, that is only possible with an interactive cluster, which costs more DBUs.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Dec 2021 00:42:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32435#M23635</guid>
      <dc:creator>RicksDB</dc:creator>
      <dc:date>2021-12-31T00:42:00Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32436#M23636</link>
      <description>&lt;P&gt;Yes, that's correct. There is a new feature on the roadmap for reusing the same cluster, which should help and speed things up.&lt;/P&gt;&lt;P&gt;It might still be worth running everything on one interactive cluster, which again shouldn't be too expensive for a small single-node cluster.&lt;/P&gt;</description>
      <pubDate>Fri, 31 Dec 2021 00:49:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32436#M23636</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2021-12-31T00:49:15Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32437#M23637</link>
      <description>&lt;P&gt;Thanks for the help, josephk. I will continue to use an interactive cluster until that new feature is released. Hopefully, it will support my use case. Is there any visibility into the roadmap, such as an ETA or more information on the feature?&lt;/P&gt;</description>
      <pubDate>Fri, 31 Dec 2021 01:03:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32437#M23637</guid>
      <dc:creator>RicksDB</dc:creator>
      <dc:date>2021-12-31T01:03:27Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32438#M23638</link>
      <description>&lt;P&gt;The image is from the roadmap that was released in November, so it should be in preview sometime this month if it isn't already.  Talk to your CSE about the preview testing.&lt;/P&gt;</description>
      <pubDate>Sun, 02 Jan 2022 00:00:16 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32438#M23638</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-01-02T00:00:16Z</dc:date>
    </item>
    <item>
      <title>Re: Configure jobs throttling for ephemeral cluster ETLs</title>
      <link>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32439#M23639</link>
      <description>&lt;P&gt;@E H​&amp;nbsp;that feature is in preview! Hit me up at bilal dot aslam at databricks dot com and I will get you enrolled in it.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Jan 2022 14:50:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/configure-jobs-throttling-for-ephemeral-cluster-etls/m-p/32439#M23639</guid>
      <dc:creator>BilalAslamDbrx</dc:creator>
      <dc:date>2022-01-03T14:50:10Z</dc:date>
    </item>
  </channel>
</rss>

