<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Exact cost for job execution calculation in Administration &amp; Architecture</title>
    <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/94055#M2061</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/104480"&gt;@radothede&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;That's what I'm getting at. You used 0.159 + 0.392 = 0.551 DBU, but that is only the compute power. So technically it is just a snapshot of how many resources you used in the background. But the important information for calculating the cost is actually the time (e.g. 72 seconds) you used these DBUs for. So DBU is not a metric to calculate cost, it's DBU-hour.&lt;/P&gt;</description>
    <pubDate>Tue, 15 Oct 2024 09:26:36 GMT</pubDate>
    <dc:creator>jreh</dc:creator>
    <dc:date>2024-10-15T09:26:36Z</dc:date>
    <item>
      <title>Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93614#M2041</link>
      <description>&lt;P&gt;Hi everybody,&lt;/P&gt;&lt;P&gt;I want to calculate the exact cost of a single job execution. All the examples I can find on the internet use the tables&amp;nbsp;&lt;SPAN&gt;system.billing.usage and&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;system.billing.list_prices. It makes sense to calculate the sum of DBUs consumed and multiply it by the current price for that SKU. What confuses me is that the usage_unit used there is DBU but the prices are for DBUs used per hour. If every time window found in&amp;nbsp;&lt;SPAN&gt;system.billing.usage were 1 hour, it would all align, but I also find time windows of 10 minutes. If it is as shown in the screenshot with a 10-minute time window, wouldn't I need to first divide the usage_quantity by 6, as it is only used for one sixth of an hour, and then multiply that by the prices in the price list?&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&lt;SPAN&gt;Also, can someone explain what the single rows are? Are these different compute instances used for the job execution?&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="jreh_0-1728643622955.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11839i6672C36A3A373FA6/image-size/medium?v=v2&amp;amp;px=400" role="button" title="jreh_0-1728643622955.png" alt="jreh_0-1728643622955.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 11 Oct 2024 10:52:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93614#M2041</guid>
      <dc:creator>jreh</dc:creator>
      <dc:date>2024-10-11T10:52:21Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93689#M2046</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/126209"&gt;@jreh&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;You are right about calculating the cost of a single job execution using&amp;nbsp;&lt;SPAN&gt;system.billing.usage and&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;system.billing.list_prices.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;The &lt;STRONG&gt;DBU consumption&lt;/STRONG&gt; reported in shorter windows (e.g., 10 minutes) &lt;STRONG&gt;already reflects the time window&lt;/STRONG&gt;, so you don't need to divide it by 6.&lt;/P&gt;&lt;P&gt;So in general, what you want to do is sum this expression for each job run:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;sum(usage_quantity * list_prices.pricing.default)&lt;/LI-CODE&gt;&lt;P&gt;You can find more useful examples here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/en/admin/system-tables/jobs-cost.html" target="_blank"&gt;https://docs.databricks.com/en/admin/system-tables/jobs-cost.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Regarding the&amp;nbsp;&lt;SPAN&gt;system.billing.usage system table, you can find out more here:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/en/admin/system-tables/billing.html" target="_blank"&gt;https://docs.databricks.com/en/admin/system-tables/billing.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;A single row in the system.billing.usage table represents a snapshot of DBU usage for a specific job run within a time window and for a specific SKU.&lt;/P&gt;</description>
      <pubDate>Sun, 13 Oct 2024 09:07:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93689#M2046</guid>
      <dc:creator>radothede</dc:creator>
      <dc:date>2024-10-13T09:07:41Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93768#M2050</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/104480"&gt;@radothede&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Thank you for your answer.&lt;/P&gt;&lt;P&gt;But I just want to clarify so everything is correct:&lt;BR /&gt;Isn't the metric that is used for cost calculation &lt;STRONG&gt;DBU-hour&lt;/STRONG&gt;? And the metric for (let's keep it simple) compute power is &lt;STRONG&gt;DBU&lt;/STRONG&gt;. So yes, the DBU consumption is correct in the table but the unit for the cost is still &lt;STRONG&gt;DBU-hour&lt;/STRONG&gt;. I pay a specific price if I use a cluster with that compute size for 1 hour.&lt;/P&gt;&lt;P&gt;Normally, all the time windows in the system.billing.usage table are exactly 1 hour, except for &lt;STRONG&gt;serverless&lt;/STRONG&gt; compute. Here we have 10 minutes. If I use the proposed formula without scaling it by the fraction of the hour used, my serverless jobs are 5-6 times more expensive than running them on a regular job cluster, which does not seem right.&lt;/P&gt;&lt;P&gt;In my mind this works exactly the way cost calculation for electricity does. If I have a device that draws 1000W (1kW) and I use that device for 1 hour, I have used 1kWh, and my price is fixed per kWh. I don't get charged merely for using a 1kW device; I get charged for the time that device is used. If I use that device for only 30 minutes, I have used 1kW for half an hour, so 0.5kWh.&lt;/P&gt;&lt;P&gt;Please let me know if I'm wrong with that assumption.&lt;/P&gt;</description>
      <pubDate>Mon, 14 Oct 2024 06:38:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93768#M2050</guid>
      <dc:creator>jreh</dc:creator>
      <dc:date>2024-10-14T06:38:19Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93799#M2051</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/126209"&gt;@jreh&lt;/a&gt;&amp;nbsp;which specific serverless SKU would you like to check?&lt;/P&gt;&lt;P&gt;Can you share the query please?&lt;/P&gt;&lt;P&gt;I've double-checked that, and I'm sure the usage quantity, at least in my case, reflects the time window, so it's 10 minutes of consumption, and I do not need to divide it by 6.&lt;/P&gt;&lt;P&gt;I'm using this query to produce the output:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;select sku_name, 
       usage_start_time, 
       usage_end_time, 
       usage_quantity,
       usage_unit
from system.billing.usage 
where sku_name like '%SERVERLESS%'
and USAGE_END_TIME - USAGE_START_TIME = INTERVAL '0 00:10:00' DAY TO SECOND
group by all;&lt;/LI-CODE&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="radothede_2-1728895831947.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11887iF435C654DE995723/image-size/medium?v=v2&amp;amp;px=400" role="button" title="radothede_2-1728895831947.png" alt="radothede_2-1728895831947.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 14 Oct 2024 08:50:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/93799#M2051</guid>
      <dc:creator>radothede</dc:creator>
      <dc:date>2024-10-14T08:50:58Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/94034#M2059</link>
      <description>&lt;P&gt;Actually, I've checked the table details based on one job run example, and this table looks messy...&lt;/P&gt;&lt;P&gt;This job was triggered on a serverless job cluster and succeeded in 1 minute and 12 seconds.&lt;/P&gt;&lt;P&gt;In system.billing.usage I can see 3 different rows for this job run:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="radothede_1-1728982792984.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11932i2C90FF8F6B9191D4/image-size/medium?v=v2&amp;amp;px=400" role="button" title="radothede_1-1728982792984.png" alt="radothede_1-1728982792984.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I don't think it is possible that the job used 0.159 + 0.392 DBU in 72 seconds, is it?&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Could anyone from Databricks confirm how the logic actually works so we won't need to guess?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2024 09:00:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/94034#M2059</guid>
      <dc:creator>radothede</dc:creator>
      <dc:date>2024-10-15T09:00:05Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/94055#M2061</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/104480"&gt;@radothede&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;That's what I'm getting at. You used 0.159 + 0.392 = 0.551 DBU, but that is only the compute power. So technically it is just a snapshot of how many resources you used in the background. But the important information for calculating the cost is actually the time (e.g. 72 seconds) you used these DBUs for. So DBU is not a metric to calculate cost, it's DBU-hour.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Oct 2024 09:26:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/94055#M2061</guid>
      <dc:creator>jreh</dc:creator>
      <dc:date>2024-10-15T09:26:36Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/95123#M2128</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/104480"&gt;@radothede&lt;/a&gt;, I've clarified this with Databricks and my assumption was correct. The formula&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;sum(usage_quantity * list_prices.pricing.default)&lt;/LI-CODE&gt;&lt;P&gt;is only right if the time window in the usage table is 1 hour. For every window that is not 1 hour, the fraction of the hour used needs to be calculated and multiplied by the usage_quantity. So something like this might be the correct formula:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;select ((unix_timestamp(usage_end_time) - unix_timestamp(usage_start_time))/3600) * usage_quantity as DBU_hours, * from system.billing.usage&lt;/LI-CODE&gt;</description>
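A minimal Python sketch of the arithmetic discussed in this thread, comparing the plain sum from the earlier reply with the time-weighted version proposed in the post above. It only illustrates the two formulas; it does not confirm which one matches actual billing. The timestamps, the row layout and PRICE_PER_DBU are hypothetical values chosen for illustration; only the quantities 0.159 and 0.392 are taken from the thread.

```python
from datetime import datetime

# Hypothetical rows shaped like system.billing.usage:
# (usage_start_time, usage_end_time, usage_quantity).
# The quantities come from the thread; the 10-minute windows are assumed.
ROWS = [
    (datetime(2024, 10, 15, 8, 0), datetime(2024, 10, 15, 8, 10), 0.159),
    (datetime(2024, 10, 15, 8, 0), datetime(2024, 10, 15, 8, 10), 0.392),
]

# Hypothetical price per DBU; real values live in system.billing.list_prices.
PRICE_PER_DBU = 0.35


def window_hours(start, end):
    """Length of a usage window expressed as a fraction of an hour."""
    return (end - start).total_seconds() / 3600


def unweighted_cost(rows, price):
    """sum(usage_quantity * price), the formula from the earlier reply."""
    return sum(qty * price for _, _, qty in rows)


def time_weighted_cost(rows, price):
    """Scale each row by its window length, as the post above proposes."""
    return sum(window_hours(s, e) * qty * price for s, e, qty in rows)


if __name__ == "__main__":
    print("unweighted:   ", round(unweighted_cost(ROWS, PRICE_PER_DBU), 6))
    print("time-weighted:", round(time_weighted_cost(ROWS, PRICE_PER_DBU), 6))
```

With 10-minute windows the two results differ by exactly a factor of 6, which is the discrepancy the thread is arguing about.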
      <pubDate>Mon, 21 Oct 2024 06:58:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/95123#M2128</guid>
      <dc:creator>jreh</dc:creator>
      <dc:date>2024-10-21T06:58:52Z</dc:date>
    </item>
    <item>
      <title>Re: Exact cost for job execution calculation</title>
      <link>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/116307#M3280</link>
      <description>&lt;P&gt;And what about the costs for the disks of the VMs of the cluster?&lt;/P&gt;</description>
      <pubDate>Wed, 23 Apr 2025 10:23:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/administration-architecture/exact-cost-for-job-execution-calculation/m-p/116307#M3280</guid>
      <dc:creator>vziog</dc:creator>
      <dc:date>2025-04-23T10:23:57Z</dc:date>
    </item>
  </channel>
</rss>

