12-03-2025 08:36 AM
Similar to the cloud infra calculators, does a TCO calculator exist for Databricks?
Let's say we have inputs such as the number of source tables, the estimated number of data pipelines, data growth per day, transformation complexity, target reports, and the number of users for analytical usage. Is there a way to calculate a cost estimate to within +/-50% or +/-100%?
I totally get the factors around consumption and DBUs, but to give a fair idea, does any such calculator exist? Or is there a sensible way to derive an estimate, rather than 'finger in the air'? Assume a complete ELT/analytics use case.
12-03-2025 08:44 AM
[Cannot edit Q] So, for simplicity, let's assume Serverless for Job compute and a Serverless SQL Warehouse.
12-03-2025 09:00 AM
Hi @Raman_Unifeye ,
There's a pricing calculator that you can check:
12-04-2025 01:55 AM
@szymon_dybczak - I am aware of that calculator. The challenge, however, is how to work out the number of DBUs a workload will consume based on the volume of data processed and similar factors. The tool starts from infra and compute inputs. My question is: if the input parameters are the data volume, the number of pipelines, and the transformation complexity, how do I convert those into a DBU consumption figure that can then be used in the calculator?
12-04-2025 02:27 AM
Hi @Raman_Unifeye ,
The thing is, Databricks pricing is based on your compute usage. Storage, networking, and related costs will vary depending on the services you choose and your cloud service provider.
I think you won't find such a tool because every workload is different. For example, the cost of processing a table with hundreds of millions of rows can vary significantly between two data pipelines. In pipeline A, you may have very complex transformations, and the time spent computing them will greatly affect the DBU cost (compute usage).
Meanwhile, pipeline B may simply take the data and perform a straightforward insert without any transformations. The cost of such a pipeline will be much lower, even though the amount of data processed is similar.
What I’m trying to say is that you won’t find a tool that can reliably estimate DBU cost based solely on data volume. By understanding your environments and transformations, you can try to estimate it yourself, but you won’t find a generic solution that will accurately calculate it for you.
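As a rough illustration of estimating it yourself, here is a minimal Python sketch of a bottom-up estimate. It assumes you have run a small pilot of each pipeline and noted how long a run takes and roughly how many DBUs per hour it burns. Every number below (run counts, hours, DBU rates, and the price per DBU) is a hypothetical placeholder, not a Databricks figure, and the `Pipeline` class and helper names are made up for this example. The +/-50% band is simply the tolerance asked for in the question, not a statistically derived interval.

```python
from dataclasses import dataclass


@dataclass
class Pipeline:
    name: str
    runs_per_month: int
    hours_per_run: float  # measured in a pilot run, not derived from data volume
    dbu_per_hour: float   # observed DBU burn rate during the pilot (placeholder)


def monthly_cost(pipelines, usd_per_dbu):
    """Sum of runs * hours * DBU/hour * price per DBU across all pipelines."""
    return sum(
        p.runs_per_month * p.hours_per_run * p.dbu_per_hour * usd_per_dbu
        for p in pipelines
    )


def cost_range(point_estimate, uncertainty=0.5):
    """Return (low, high) bounds for a +/- uncertainty band around the estimate."""
    return point_estimate * (1 - uncertainty), point_estimate * (1 + uncertainty)


# Two pipelines over a similar data volume but with very different runtimes,
# mirroring pipeline A (complex transformations) and pipeline B (plain insert).
pipelines = [
    Pipeline("A_complex_transforms", runs_per_month=30, hours_per_run=2.0, dbu_per_hour=8.0),
    Pipeline("B_plain_insert", runs_per_month=30, hours_per_run=0.25, dbu_per_hour=8.0),
]

USD_PER_DBU = 0.50  # placeholder; take the real rate from the pricing calculator

estimate = monthly_cost(pipelines, USD_PER_DBU)
low, high = cost_range(estimate, uncertainty=0.5)
print(f"Estimated monthly compute cost: ${estimate:,.2f} (range ${low:,.2f} to ${high:,.2f})")
```

The point of the sketch is that runtime, not data volume, drives the number: with these placeholder inputs pipeline A costs eight times as much as pipeline B on the same amount of data. Once real workloads are running, replacing the pilot figures with actual usage data should tighten the range considerably.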