Databricks Community

sashikanth · ‎10-18-2024

How to increase resource efficiency in Databricks jobs?

We see that the idle cost is higher than the utilization cost. Any guidelines would be helpful.

Please share some examples.

shashank853 · ‎10-18-2024

Hi,

You can check the components below for managing idle costs:

Auto-scaling and Auto-termination:

Auto-scaling: Enable auto-scaling to dynamically adjust the number of worker nodes based on job requirements. This helps in scaling up during high demand and scaling down during low demand.
Auto-termination: Configure clusters to automatically terminate after a set period of inactivity. This prevents idle clusters from incurring unnecessary costs.
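
As a rough sketch of the two settings above, a cluster spec for the Clusters API could look like the following. The field names `autoscale`, `min_workers`, `max_workers`, and `autotermination_minutes` come from the Clusters API; the cluster name, runtime version, node type, and the helper function itself are placeholders chosen for illustration. Note that auto-termination mainly matters for all-purpose clusters, since a job cluster shuts down when its run finishes.

```python
import json


def build_cluster_spec(min_workers: int, max_workers: int, idle_minutes: int) -> dict:
    """Return a cluster spec with autoscaling and auto-termination enabled.

    Hypothetical helper for illustration; it only builds the request body.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": "example-autoscaling-cluster",  # placeholder name
        "spark_version": "15.4.x-scala2.12",            # placeholder runtime version
        "node_type_id": "i3.xlarge",                    # placeholder node type (cloud-specific)
        # Worker count floats between these bounds depending on load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Terminate the cluster after this many idle minutes.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec(min_workers=2, max_workers=8, idle_minutes=20)
print(json.dumps(spec, indent=2))
```

A short idle timeout is usually the quickest way to cut idle cost on interactive clusters; the right value depends on how often users come back to the cluster.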

Use Job Compute:

Job Compute vs. All-Purpose Compute: Running non-interactive workloads on job compute instances is more cost-effective than using all-purpose compute instances.
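
One way to find jobs that still run on all-purpose compute is to inspect their task definitions. In the Jobs API, a task that points at an all-purpose cluster carries `existing_cluster_id`, while a task on job compute uses `new_cluster` or `job_cluster_key`. The sketch below assumes you already have the job settings as a Python dict (for example from a Jobs API response); the function name, the sample job, and the cluster ID are made up for illustration.

```python
def tasks_on_all_purpose_compute(job_settings: dict) -> list[str]:
    """Return the task keys that run on an existing (all-purpose) cluster."""
    return [
        task.get("task_key", "<unnamed>")
        for task in job_settings.get("tasks", [])
        if "existing_cluster_id" in task
    ]


# Hypothetical job definition with one task on each kind of compute.
example_job = {
    "name": "example-nightly-job",
    "tasks": [
        {"task_key": "etl", "job_cluster_key": "etl_cluster"},
        {"task_key": "report", "existing_cluster_id": "0000-000000-example"},
    ],
}

flagged = tasks_on_all_purpose_compute(example_job)
print(flagged)
```

Any task this flags is a candidate for moving to a job cluster.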

Choose the Right Instance Type:

Instance Type Selection: Select instance types based on workload characteristics. For example, use memory-optimized instances for ML tasks and compute-optimized instances for streaming workloads.

Efficient Compute Size:

Compute Sizing Considerations: Consider factors like total executor cores, memory, and local storage when sizing your compute. This ensures optimal resource utilization and cost efficiency.

Design Cost-effective Workloads:

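Balance Always-on and Triggered Streaming: An always-on stream keeps its compute running around the clock. For use cases that do not require immediate data updates, run the stream as a triggered job on a schedule (for example, every few hours) so that compute is only paid for while data is being processed.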
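
A minimal PySpark sketch of a triggered stream, assuming a Spark version that supports `trigger(availableNow=True)` (3.3 or later) and Delta tables as source and target. The function name, table names, and checkpoint path are placeholders. The stream processes whatever is available, then stops, so a scheduled job cluster can shut down afterwards.

```python
def run_triggered_stream(spark, source_table: str, target_table: str, checkpoint_path: str):
    """Process all currently available data from source_table, then stop.

    `spark` is an existing SparkSession (provided automatically in Databricks notebooks).
    """
    stream = spark.readStream.table(source_table)
    query = (
        stream.writeStream
        .option("checkpointLocation", checkpoint_path)  # tracks progress between runs
        .trigger(availableNow=True)                     # drain available data, then stop
        .toTable(target_table)
    )
    query.awaitTermination()  # returns once the backlog has been processed
    return query


# Example call inside a scheduled job (names are placeholders):
# run_triggered_stream(spark, "raw.events", "clean.events", "/tmp/checkpoints/events")
```

Scheduling this as a job a few times a day replaces a 24/7 streaming cluster with short runs, at the price of higher data latency.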

Check the doc: https://docs.databricks.com/en/lakehouse-architecture/cost-optimization/best-practices.html

-werners- · ‎10-18-2024

My main improvements are:

- use single-node job clusters for small data
- cluster reuse (so use the same job cluster for multiple tasks, in parallel or serial)
- use autoscaling only when it is very hard to find a good fixed sizing, otherwise go for fixed size.
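
To illustrate the first two points, here is a sketch of a Jobs API job definition in which two tasks share one single-node job cluster. The `job_clusters` / `job_cluster_key` structure and the single-node settings (`num_workers: 0` plus the `singleNode` profile) follow the Jobs and Clusters API as I understand them; the job name, task keys, notebook paths, runtime version, and node type are placeholders.

```python
import json

# Single-node cluster: no workers, Spark runs locally on the driver.
single_node_cluster = {
    "spark_version": "15.4.x-scala2.12",  # placeholder runtime version
    "node_type_id": "i3.xlarge",          # placeholder node type (cloud-specific)
    "num_workers": 0,
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}

job_settings = {
    "name": "example-small-data-job",
    # Defined once, reused by every task that references its key.
    "job_clusters": [
        {"job_cluster_key": "shared_small", "new_cluster": single_node_cluster}
    ],
    "tasks": [
        {
            "task_key": "ingest",
            "job_cluster_key": "shared_small",
            "notebook_task": {"notebook_path": "/Workspace/example/ingest"},
        },
        {
            "task_key": "transform",
            "depends_on": [{"task_key": "ingest"}],  # runs after ingest, on the same cluster
            "job_cluster_key": "shared_small",
            "notebook_task": {"notebook_path": "/Workspace/example/transform"},
        },
    ],
}

print(json.dumps(job_settings, indent=2))
```

Because both tasks reference the same `job_cluster_key`, the cluster starts once per run instead of once per task, which saves the startup time that would otherwise be billed for each task.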
