Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Cost

Gmera
New Contributor

Do you have any information that would help me optimize costs and follow up on them?

1 ACCEPTED SOLUTION

Athul97
New Contributor III

1. Use Jobs clusters instead of All-Purpose clusters (see the sample cluster spec after this list)
2. Enable Auto-Termination to shut down idle clusters
3. Archive cold data to low-cost storage tiers (e.g., Azure Blob cool tier, AWS S3 Glacier)
4. Run jobs in off-peak hours to leverage spot pricing
5. Use Photon Engine for faster, cheaper queries
6. Set spending budgets and alerts in cloud cost tools
7. Regularly review cluster & job usage reports
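
A minimal sketch of how items 1, 4, and 5 can be combined, as the job cluster fragment of a Jobs API 2.1 payload written as a Python dict. The runtime version, node type, and tag values are placeholders, and the aws_attributes block would differ on Azure or GCP:

    # Sketch of a Jobs API 2.1 job cluster: ephemeral (billed at the cheaper
    # Jobs compute rate), spot workers with on-demand fallback, Photon enabled.
    # Runtime version, node type, and tag values below are placeholders.
    job_cluster_spec = {
        "job_cluster_key": "etl_cluster",         # referenced by the job's tasks
        "new_cluster": {
            "spark_version": "15.4.x-scala2.12",  # pick a current LTS runtime
            "node_type_id": "i3.xlarge",          # placeholder instance type
            "autoscale": {"min_workers": 1, "max_workers": 4},
            "runtime_engine": "PHOTON",           # item 5: Photon
            "aws_attributes": {                   # item 4: spot with fallback
                "availability": "SPOT_WITH_FALLBACK",
                "first_on_demand": 1,             # keep the driver on-demand
            },
            "custom_tags": {"team": "data-eng"},  # feeds cost attribution
        },
    }

Note that a job cluster terminates on its own when the run finishes; item 2's auto-termination setting (autotermination_minutes) applies to All-Purpose clusters.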


2 REPLIES


jameshughes
Contributor

@Athul97 provided a pretty solid list of best practices.  To go deeper into Budgets & Alerts, I have had a lot of success with the Consumption and Budget feature in the Databricks Account Portal under the Usage menu.  Once you embed tagging into all Databricks assets, you can get a really good picture of usage and a handle on where the spend is occurring.  This can obviously be married up with the general cloud consumption costs for things like Storage and Networking, but it gives you more granular reporting inside your workspaces.
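
A hedged sketch of what that tag-based reporting can look like in a notebook, assuming Unity Catalog system tables are enabled; the "team" tag key is an example and must match the tags you actually apply, and spark is the session Databricks provides in notebooks:

    # Break DBU consumption for the last 30 days down by a custom tag,
    # using the built-in system.billing.usage billing table.
    usage_by_tag = spark.sql("""
        SELECT
            custom_tags['team'] AS team,   -- example tag key
            sku_name,
            SUM(usage_quantity) AS dbus
        FROM system.billing.usage
        WHERE usage_date >= date_sub(current_date(), 30)
        GROUP BY custom_tags['team'], sku_name
        ORDER BY dbus DESC
    """)
    usage_by_tag.show(truncate=False)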

The other area where I see opportunity is setting up some type of engineering code review and optimization process.  I still see a lot of poor development practices where incorrect usage of libraries or poor data processing algorithms cause unnecessary cluster cycles.  I recently audited a customer's worst-performing jobs and made a number of coding suggestions that led to significant reductions in execution times.  Many of the jobs that ran for hours now complete in 30-45 minutes without any changes to the cluster configurations.
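
As an illustration of the kind of rewrite that recovers those cluster cycles (not one from the audit described above): a row-at-a-time Python UDF forces data to be serialized between the JVM and Python and is not Photon-eligible, while equivalent built-in column functions stay inside the engine:

    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    # Toy DataFrame standing in for a real table with an "email" column.
    df = spark.createDataFrame([("  Alice@Example.COM ",)], ["email"])

    # Before: per-row Python UDF (slow, blocks Photon)
    normalize_udf = F.udf(lambda s: s.strip().lower() if s else None, StringType())
    slow = df.withColumn("email", normalize_udf("email"))

    # After: same logic with native column expressions
    fast = df.withColumn("email", F.lower(F.trim("email")))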
