Job optimization

sashikanth — Fri, 18 Oct 2024 11:02:35 GMT

How can we increase resource efficiency in Databricks jobs?

We see that the idle cost is higher than the utilization cost. Any guidelines would be helpful.

Please share some examples.

Re: Job optimization

shashank853 — Fri, 18 Oct 2024 11:07:59 GMT

Hi,

You can review the following areas to reduce idle costs:

Auto-scaling and Auto-termination:

Auto-scaling: Enable auto-scaling to dynamically adjust the number of worker nodes based on job requirements. This helps in scaling up during high demand and scaling down during low demand.
Auto-termination: Configure clusters to automatically terminate after a set period of inactivity. This prevents idle clusters from incurring unnecessary costs.
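
As a rough illustration of both settings, here is a minimal Python sketch that builds a cluster spec in the shape the Databricks Clusters API accepts. The field names `autoscale`, `min_workers`, `max_workers`, and `autotermination_minutes` come from that API; the helper name, runtime version, node type, and numbers are placeholders picked for the example, so adjust them for your workspace.

```python
import json


def build_cluster_spec(min_workers, max_workers, idle_minutes):
    """Return a cluster spec dict with autoscaling and auto-termination.

    Hypothetical helper: spark_version and node_type_id are placeholders,
    not recommendations.
    """
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    return {
        "cluster_name": "example-autoscaling-cluster",  # placeholder name
        "spark_version": "15.4.x-scala2.12",            # placeholder runtime
        "node_type_id": "i3.xlarge",                    # placeholder node type
        # Worker count floats between these bounds based on load.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Cluster shuts down after this many minutes without activity.
        "autotermination_minutes": idle_minutes,
    }


spec = build_cluster_spec(min_workers=2, max_workers=8, idle_minutes=20)
print(json.dumps(spec, indent=2))
```

Auto-termination matters mostly for all-purpose clusters, since a job cluster is shut down when its run finishes.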

Use Job Compute:

Job Compute vs. All-Purpose Compute: Run non-interactive workloads on job compute rather than all-purpose compute. Job compute is billed at a lower rate, and a job cluster exists only for the duration of the run, so it does not sit idle afterwards.
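
One way to find candidates for this change is to scan job settings for tasks that still point at an all-purpose cluster. The sketch below assumes the job settings are available as a dict shaped like a Jobs API payload, where a task uses `existing_cluster_id` for an all-purpose cluster and `new_cluster` or `job_cluster_key` for job compute. The function name and the example job are made up for illustration.

```python
def tasks_on_all_purpose_compute(job_settings):
    """Return the task keys that run on an existing (all-purpose) cluster."""
    return [
        task.get("task_key", "<unnamed>")
        for task in job_settings.get("tasks", [])
        if "existing_cluster_id" in task
    ]


# Hypothetical job settings, shaped like a Jobs API payload.
example_job = {
    "name": "nightly-etl",
    "tasks": [
        {"task_key": "ingest", "existing_cluster_id": "0000-000000-example"},
        {"task_key": "transform", "new_cluster": {"num_workers": 2}},
        {"task_key": "publish", "job_cluster_key": "shared"},
    ],
}

print(tasks_on_all_purpose_compute(example_job))  # ['ingest']
```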

Choose the Right Instance Type:

Instance Type Selection: Select instance types based on workload characteristics. For example, use memory-optimized instances for ML tasks and compute-optimized instances for streaming workloads.

Efficient Compute Size:

Compute Sizing Considerations: Consider factors like total executor cores, memory, and local storage when sizing your compute. This ensures optimal resource utilization and cost efficiency.
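
To make the sizing arithmetic concrete, here is a small Python sketch that works out how many workers are needed to reach a target core count and memory size. All numbers are hypothetical and the function name is my own; real sizing should also account for local storage and for what the job actually uses at runtime.

```python
import math


def workers_needed(target_cores, target_memory_gb,
                   cores_per_worker, memory_gb_per_worker):
    """Return the smallest worker count that meets both targets."""
    by_cores = math.ceil(target_cores / cores_per_worker)
    by_memory = math.ceil(target_memory_gb / memory_gb_per_worker)
    # The tighter of the two constraints decides the cluster size.
    return max(by_cores, by_memory)


# Hypothetical workload: 32 executor cores and 300 GB of memory,
# on workers with 8 cores and 64 GB each.
workers = workers_needed(32, 300, cores_per_worker=8, memory_gb_per_worker=64)
print(workers)  # 5, because memory is the limiting factor here
```

If memory is the limiting factor, as in this example, a memory-optimized instance type may reach the same target with fewer workers.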

Design Cost-effective Workloads:

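Balance Always-on and Triggered Streaming: If a use case does not need continuously fresh data, run the stream as a scheduled, triggered job instead of keeping it always on, and schedule only as many runs as the freshness requirement demands.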
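
A quick back-of-the-envelope comparison shows why this matters. The sketch below only counts compute hours per day; the run counts and durations are hypothetical, and it ignores cluster startup time and any per-hour pricing differences.

```python
def daily_compute_hours(runs_per_day, minutes_per_run):
    """Return the compute hours consumed per day by scheduled runs."""
    return runs_per_day * minutes_per_run / 60


always_on_hours = 24.0                         # stream running all day
hourly_hours = daily_compute_hours(24, 10)     # one 10-minute run per hour
six_hourly_hours = daily_compute_hours(4, 10)  # one 10-minute run every 6 hours

print(f"always on:     {always_on_hours:.2f} h/day")
print(f"hourly runs:   {hourly_hours:.2f} h/day")
print(f"every 6 hours: {six_hourly_hours:.2f} h/day")
```

In Structured Streaming, a triggered run of this kind is usually written with `trigger(availableNow=True)`, which processes the data available at start and then stops, so the job cluster can terminate.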

For details, see the docs: https://docs.databricks.com/en/lakehouse-architecture/cost-optimization/best-practices.html

Re: Job optimization

-werners- — Fri, 18 Oct 2024 11:50:52 GMT

My main improvements are:

- use single-node job clusters for small data
- reuse clusters (share the same job cluster across multiple tasks, whether they run in parallel or in series)
- use autoscaling only when it is very hard to find a good fixed size; otherwise go for a fixed size.
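
To show the first two points together, here is a sketch of job settings in which two tasks share one single-node job cluster. The overall shape (`job_clusters`, `job_cluster_key`, `new_cluster`, `depends_on`) follows the Jobs API, and the single-node settings are the ones I understand Databricks documents for that mode. The job name, notebook paths, runtime version, and node type are placeholders.

```python
import json

# Single-node cluster: zero workers, driver does all the work.
single_node_cluster = {
    "spark_version": "15.4.x-scala2.12",  # placeholder runtime
    "node_type_id": "i3.xlarge",          # placeholder node type
    "num_workers": 0,
    "spark_conf": {
        "spark.databricks.cluster.profile": "singleNode",
        "spark.master": "local[*]",
    },
    "custom_tags": {"ResourceClass": "SingleNode"},
}

job_settings = {
    "name": "small-data-pipeline",  # placeholder job name
    # Defined once, then referenced by key from every task below.
    "job_clusters": [
        {"job_cluster_key": "shared_small", "new_cluster": single_node_cluster}
    ],
    "tasks": [
        {
            "task_key": "extract",
            "job_cluster_key": "shared_small",
            "notebook_task": {"notebook_path": "/Workspace/example/extract"},
        },
        {
            "task_key": "load",
            "depends_on": [{"task_key": "extract"}],
            "job_cluster_key": "shared_small",
            "notebook_task": {"notebook_path": "/Workspace/example/load"},
        },
    ],
}

print(json.dumps(job_settings, indent=2))
```

Because both tasks reference the same `job_cluster_key`, the cluster is started once for the run instead of once per task, which saves the startup time and cost of a second cluster.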
