2 weeks ago
Since Databricks does not provide individual cost breakdowns for components like Jobs or Compute, we aim to build a custom usage dashboard that leverages APIs to display the cost of each job run across Databricks, Azure Data Factory (ADF), and serverless environments. If anyone has experience with a similar use case or has implemented such a solution, your insights would be greatly appreciated!
2 weeks ago
Hey!
Databricks recently introduced system tables that provide job cost analysis, which might help achieve your goal without building a custom solution from scratch.
These tables give you per-run cost insight within Databricks, and you may be able to correlate that data with runs triggered from external tools like Azure Data Factory or with serverless workloads by combining job metadata and billing data.
You can check out the official documentation here: Databricks System Tables - Jobs Cost
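Not a full solution, but here's a minimal sketch of the kind of query those tables enable, assuming the system.billing schema is enabled in your account. The hostname, HTTP path, and token are placeholders, and the price join is simplified (it ignores currency and promotional pricing):

```python
# Minimal sketch: estimate per-job DBU cost from Databricks system tables.
# Assumes system.billing is enabled; all connection values are placeholders.
from databricks import sql  # pip install databricks-sql-connector

QUERY = """
SELECT
  u.usage_metadata.job_id                   AS job_id,
  u.usage_date,
  SUM(u.usage_quantity * p.pricing.default) AS estimated_dbu_cost
FROM system.billing.usage u
JOIN system.billing.list_prices p
  ON  u.sku_name = p.sku_name
  AND u.cloud = p.cloud
  AND u.usage_start_time >= p.price_start_time
  AND (p.price_end_time IS NULL OR u.usage_start_time < p.price_end_time)
WHERE u.usage_metadata.job_id IS NOT NULL
GROUP BY u.usage_metadata.job_id, u.usage_date
ORDER BY estimated_dbu_cost DESC
"""

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    http_path="/sql/1.0/warehouses/abc123",                        # placeholder
    access_token="dapi-REDACTED",                                  # placeholder
) as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for job_id, usage_date, cost in cur.fetchall():
            print(job_id, usage_date, round(cost, 2))
```

Keep in mind this covers DBU spend only; the underlying VM cost still sits on your cloud bill.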
I hope you find this helpful. 🙂
2 weeks ago
Hey, thanks for sharing the link! I did check out the system tables in the official documentation, but they don't seem to include the cost of jobs running on SQL warehouses and all-purpose compute. I'm working on a dashboard to get a complete view of all costs, including those.
2 weeks ago
Hey!
You're right; the system tables alone might not give you the full breakdown. The approach you take to track costs also depends on the type of compute you're using.
If you're using a Classic or Pro SQL warehouse or all-purpose compute, you'll likely need to build a custom dashboard, for example in Grafana. It can pull infrastructure cost data directly from your cloud provider (AWS, Azure, GCP) and combine it with Databricks-specific costs such as DBU usage and storage; see the sketch below for the DBU side.
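To illustrate, here's a rough sketch of the DBU-cost query for exactly the compute types you mentioned; the sku_name LIKE patterns are assumptions and may need adjusting to your tier's actual SKU names. Grafana could then poll this result, or a table you persist it to, as a data source:

```python
# Rough sketch: DBU spend per SQL warehouse / all-purpose cluster, which a
# Grafana dashboard could poll. SKU LIKE patterns are assumptions.
from databricks import sql  # pip install databricks-sql-connector

QUERY = """
SELECT
  COALESCE(u.usage_metadata.warehouse_id,
           u.usage_metadata.cluster_id)     AS compute_id,
  u.sku_name,
  SUM(u.usage_quantity * p.pricing.default) AS estimated_dbu_cost
FROM system.billing.usage u
JOIN system.billing.list_prices p
  ON  u.sku_name = p.sku_name
  AND u.cloud = p.cloud
  AND u.usage_start_time >= p.price_start_time
  AND (p.price_end_time IS NULL OR u.usage_start_time < p.price_end_time)
WHERE u.sku_name LIKE '%SQL%'           -- assumption: SQL warehouse SKUs
   OR u.sku_name LIKE '%ALL_PURPOSE%'   -- assumption: all-purpose SKUs
GROUP BY 1, 2
"""

with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",  # placeholder
    http_path="/sql/1.0/warehouses/abc123",                        # placeholder
    access_token="dapi-REDACTED",                                  # placeholder
) as conn, conn.cursor() as cur:
    cur.execute(QUERY)
    for compute_id, sku, cost in cur.fetchall():
        print(compute_id, sku, round(cost, 2))
```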
Hope that helps! 🙂
2 weeks ago
Thanks. Any recommendations on how Grafana can retrieve job, serverless, and SQL warehouse cost data from Azure? Are there specific APIs or other mechanisms available for extracting cost information?
2 weeks ago
Hey,
Yes. I'm not an Azure expert, but the Databricks REST API can help you extract usage and run metadata for serverless resources, which you can then feed into custom dashboards or external tools like Grafana.
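For the Databricks side, here's a minimal sketch using the Jobs API (GET /api/2.1/jobs/runs/list) to pull run metadata that you can later join with billing data. Host and token are placeholders, and pagination is omitted for brevity:

```python
# Minimal sketch: list recent job runs via the Databricks Jobs API so run
# metadata (run_id, job_id, timings) can be joined with billing data later.
# Host and token are placeholders; pagination is omitted for brevity.
import requests

HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "dapi-REDACTED"                                      # placeholder

def list_runs(limit: int = 25) -> list[dict]:
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"limit": limit, "completed_only": "true"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("runs", [])

for run in list_runs():
    state = run.get("state", {})
    print(run["run_id"], run.get("job_id"),
          state.get("result_state"), run.get("run_duration"))
```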
On the Azure side, costs related to Databricks will appear under the Databricks resource category in Azure Cost Management + Billing. However, Azure does not break Databricks-managed workloads (like serverless SQL warehouses) down into specific line items, because Databricks abstracts the underlying infrastructure; you'll just see an aggregate charge for Databricks usage.
You can configure Azure Cost Management to export aggregated Databricks costs to a storage account or a Log Analytics Workspace. This exported data can then be combined with usage data from the Databricks API to build a more detailed view of costs.
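As a sketch of the API route (rather than the export route), the Cost Management Query API can return that aggregated Databricks charge per day. The subscription ID below is a placeholder, and the ResourceType filter value is my assumption for Azure Databricks workspaces:

```python
# Sketch: pull daily aggregated Databricks charges from the Azure Cost
# Management Query API. Subscription ID is a placeholder; the ResourceType
# filter value is an assumption.
import requests
from azure.identity import DefaultAzureCredential  # pip install azure-identity

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # placeholder
SCOPE = f"/subscriptions/{SUBSCRIPTION_ID}"

token = DefaultAzureCredential().get_token(
    "https://management.azure.com/.default"
).token

body = {
    "type": "ActualCost",
    "timeframe": "MonthToDate",
    "dataset": {
        "granularity": "Daily",
        "aggregation": {"totalCost": {"type": "Sum", "name": "PreTaxCost"}},
        "filter": {
            "dimensions": {
                "name": "ResourceType",
                "operator": "In",
                "values": ["microsoft.databricks/workspaces"],  # assumption
            }
        },
    },
}

resp = requests.post(
    f"https://management.azure.com{SCOPE}/providers/Microsoft.CostManagement/query",
    params={"api-version": "2023-03-01"},
    headers={"Authorization": f"Bearer {token}"},
    json=body,
    timeout=30,
)
resp.raise_for_status()
payload = resp.json()["properties"]
print(payload["columns"])  # column order for the rows below
for row in payload["rows"]:
    print(row)
```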
🙂