Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Databricks - Cost difference between Job Clusters and DLT

smurug24
New Contributor

I wanted to understand the cost comparison, and a few specific feature details, between job clusters and DLT (Delta Live Tables).

Per the pricing pages (both the Azure pricing page and the Databricks pricing page), my understanding is as follows. Region: US East.

Provisioned

  • Jobs Compute - $0.30/DBU-hour + VM cost
  • DLT Core - $0.30/DBU-hour + VM cost; DLT Pro - $0.38/DBU-hour + VM cost; DLT Advanced - $0.54/DBU-hour + VM cost

Serverless (without the limited-time promotion, which is valid until 31 Oct)

  • Jobs Compute - $0.45/DBU-hour
  • DLT - $0.45/DBU-hour
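
To make the comparison concrete, here is a small back-of-the-envelope calculator using the rates quoted above. The DBU-hours and VM cost figures below are hypothetical placeholders, not measured numbers; substitute your own workload's consumption:

```python
# Rough per-run cost comparison using the DBU rates quoted above (US East).
# DBU consumption and VM cost are hypothetical placeholders.

DBU_RATES = {
    "jobs_compute": 0.30,   # $/DBU-hour, provisioned
    "dlt_core": 0.30,
    "dlt_pro": 0.38,
    "dlt_advanced": 0.54,
    "serverless": 0.45,     # jobs and DLT serverless, pre-promotion rate
}

def run_cost(sku: str, dbu_hours: float, vm_cost: float = 0.0) -> float:
    """Estimated cost of one run: DBU charge plus (for provisioned) VM cost."""
    return DBU_RATES[sku] * dbu_hours + vm_cost

# Example: a run consuming 10 DBU-hours, with $1.50 of VM cost when provisioned.
for sku in ("jobs_compute", "dlt_core", "dlt_pro", "dlt_advanced"):
    print(f"{sku:14s} ${run_cost(sku, 10, vm_cost=1.50):.2f}")
print(f"{'serverless':14s} ${run_cost('serverless', 10):.2f}")
```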

Clarifications

  • DLT Serverless does not appear to have separate pricing for the different variants; it looks like all the features, such as CDC and DQ rules, are packaged into the same edition. Is that correct?
  • Given the above pricing, does this mean that notebooks executed via Workflows on serverless jobs compute will cost the same as DLT serverless?
  • In provisioned DLT there are two clusters: updates (which performs the actual data processing) and maintenance (which performs maintenance operations). Will DLT serverless likewise run two different clusters internally, which would invariably make it costlier than job clusters? Essentially, I want to understand whether DLT serverless consumes additional DBUs for maintenance or other management overhead compared to serverless jobs compute.
2 REPLIES

thomas-totter
New Contributor III

@smurug24 I don't quite get why you are comparing DLT and jobs compute. A DLT compute is only usable in a Delta Live Tables pipeline, which means you are dealing with a streaming use case. The only way to implement such a workload in a job would be via Spark Structured Streaming. If you are deciding between Structured Streaming and DLT, there are a lot of other (and in my opinion more relevant) things to consider than compute costs.

If you start/orchestrate a DLT pipeline from a Job/Workflow, it will still use a DLT compute, not a job compute.
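
For reference, this is roughly what that looks like as a Jobs API task definition: the task points at the pipeline rather than at a notebook, so the pipeline's own (DLT) compute settings apply. A minimal sketch; the job name and `pipeline_id` are placeholders:

```json
{
  "name": "orchestrate-dlt",
  "tasks": [
    {
      "task_key": "run_pipeline",
      "pipeline_task": {
        "pipeline_id": "<your-dlt-pipeline-id>",
        "full_refresh": false
      }
    }
  ]
}
```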

thomas-totter
New Contributor III

@smurug24 wrote:
  • In provisioned DLT there are two clusters: updates (which performs the actual data processing) and maintenance (which performs maintenance operations). Will DLT serverless likewise run two different clusters internally, which would invariably make it costlier than job clusters? Essentially, I want to understand whether DLT serverless consumes additional DBUs for maintenance or other management overhead compared to serverless jobs compute.

The maintenance compute in DLT only runs every once in a while to OPTIMIZE (and ZORDER, if configured) your streaming tables. That usually doesn't take long, so the cost is generally negligible in the big picture.

And regarding serverless: it shouldn't matter too much how many serverless computes you use, since serverless DBU pricing is based on the processing power your (total) workload needs. Running a workload on one serverless compute should come out roughly the same as splitting it into two half-sized workloads running in parallel.
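
A quick back-of-the-envelope check of that point, using the serverless rate quoted earlier in the thread (the DBU-hours figure is hypothetical):

```python
# Serverless DBUs follow total work: billing one workload as a whole
# vs. as two half-sized parallel pieces. DBU consumption is hypothetical.

SERVERLESS_RATE = 0.45   # $/DBU-hour, from the rates quoted above

total_dbu_hours = 20.0

one_compute = SERVERLESS_RATE * total_dbu_hours
two_computes = 2 * (SERVERLESS_RATE * total_dbu_hours / 2)

assert one_compute == two_computes  # same total charge either way
print(f"${one_compute:.2f} in both cases")
```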