<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Cost estimation before query execution similar to google cloud Big Query equivalent of --dry_run in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/cost-estimation-before-query-execution-similar-to-google-cloud/m-p/99971#M40159</link>
    <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106889"&gt;@NehaR&lt;/a&gt;,&lt;/P&gt;
&lt;P class="p1"&gt;Currently, Databricks does not have a direct equivalent to BigQuery's --dry_run feature for estimating the cost of a query before execution. However, there are some mechanisms and ongoing projects that aim to provide similar functionality. There is no ETA yet; I will post an update here if there is any news on their implementation.&lt;/P&gt;
&lt;P class="p1"&gt;For now, you can monitor the DBU consumption of your clusters and use historical data to estimate the cost of similar queries. Additionally, you can run smaller versions of your queries to get an idea of their cost and then extrapolate to larger datasets.&lt;/P&gt;</description>
    <pubDate>Mon, 25 Nov 2024 15:46:51 GMT</pubDate>
    <dc:creator>Alberto_Umana</dc:creator>
    <dc:date>2024-11-25T15:46:51Z</dc:date>
    <item>
      <title>Cost estimation before query execution similar to google cloud Big Query equivalent of --dry_run</title>
      <link>https://community.databricks.com/t5/data-engineering/cost-estimation-before-query-execution-similar-to-google-cloud/m-p/99438#M39990</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;In Databricks, is there an option to estimate the cost of a query before execution, similar to BigQuery's --dry_run?&lt;/P&gt;&lt;P&gt;Our use case is to estimate the cost before execution and be alerted.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Neha&lt;/P&gt;</description>
      <pubDate>Wed, 20 Nov 2024 00:06:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cost-estimation-before-query-execution-similar-to-google-cloud/m-p/99438#M39990</guid>
      <dc:creator>NehaR</dc:creator>
      <dc:date>2024-11-20T00:06:45Z</dc:date>
    </item>
    <item>
      <title>Re: Cost estimation before query execution similar to google cloud Big Query equivalent of --dry_run</title>
      <link>https://community.databricks.com/t5/data-engineering/cost-estimation-before-query-execution-similar-to-google-cloud/m-p/99971#M40159</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/106889"&gt;@NehaR&lt;/a&gt;,&lt;/P&gt;
&lt;P class="p1"&gt;Currently, Databricks does not have a direct equivalent to BigQuery's --dry_run feature for estimating the cost of a query before execution. However, there are some mechanisms and ongoing projects that aim to provide similar functionality. There is no ETA yet; I will post an update here if there is any news on their implementation.&lt;/P&gt;
&lt;P class="p1"&gt;For now, you can monitor the DBU consumption of your clusters and use historical data to estimate the cost of similar queries. Additionally, you can run smaller versions of your queries to get an idea of their cost and then extrapolate to larger datasets.&lt;/P&gt;</description>
      <pubDate>Mon, 25 Nov 2024 15:46:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/cost-estimation-before-query-execution-similar-to-google-cloud/m-p/99971#M40159</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-11-25T15:46:51Z</dc:date>
    </item>
  </channel>
</rss>

