- 791 Views
- 0 replies
- 2 kudos
Hi, I have several clusters, some with a 45% max spot price and some more important ones with a higher value. I want to know the best way to configure this but cannot find anything (a value showing how many nodes of the last run were on-demand will do the t...
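A per-cluster maximum spot price is normally expressed in the cluster definition itself. As a hedged sketch (the cluster names below are made up; `spot_bid_price_percent`, `availability`, and `first_on_demand` are the relevant `aws_attributes` fields in a Databricks cluster spec):

```python
# Hypothetical sketch: two cluster specs with different max spot prices,
# expressed as Databricks Clusters API payloads. Cluster names are
# placeholders for illustration.
analytics_cluster = {
    "cluster_name": "analytics-low-priority",   # hypothetical name
    "aws_attributes": {
        "availability": "SPOT_WITH_FALLBACK",   # fall back to on-demand if spot is unavailable
        "spot_bid_price_percent": 45,           # bid up to 45% of the on-demand price
        "first_on_demand": 1,                   # keep the driver on-demand
    },
}

critical_cluster = {
    "cluster_name": "etl-critical",             # hypothetical name
    "aws_attributes": {
        "availability": "SPOT_WITH_FALLBACK",
        "spot_bid_price_percent": 100,          # more important workload: bid the full on-demand price
        "first_on_demand": 1,
    },
}
```

Whether a past run used spot or on-demand nodes would then be a question for the cluster event log rather than the spec itself.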
- 930 Views
- 0 replies
- 3 kudos
Tips on Reducing Cloud Compute Infrastructure Costs for Azure VM, AWS EC2, and GCP GKE on Databricks
Databricks takes advantage of the latest Azure VM / AWS EC2 / GKE VM/instance types to ensure you get the best price performance for your workloads on...
- 2219 Views
- 2 replies
- 2 kudos
We are having difficulties running our jobs with spot instances that get reclaimed by AWS during shuffles. Do we have any documentation / best practices around this? We went through this article, but is there anything else to keep in mind?
Latest Reply
Due to recent changes in the AWS spot marketplace, legacy techniques like a higher spot bid price (>100%) are ineffective at retaining acquired spot nodes, and instances can be lost on two minutes' notice, causing workloads to fail. To mitigate this, w...
1 More Replies
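The reply above is truncated, but one commonly cited mitigation for shuffle loss on spot reclamation is Spark's graceful decommissioning (available in open-source Spark 3.1+), which migrates shuffle and cached blocks off a node before the reclamation completes. A hedged sketch of the relevant settings, whose availability on a given Databricks runtime should be checked against its release notes:

```python
# Hedged sketch: open-source Spark graceful-decommissioning settings.
# These ask Spark to migrate shuffle and RDD cache blocks away from an
# executor that is being decommissioned (e.g. on a spot interruption
# notice) instead of simply losing them.
spark_conf = {
    "spark.decommission.enabled": "true",
    "spark.storage.decommission.enabled": "true",
    "spark.storage.decommission.shuffleBlocks.enabled": "true",
    "spark.storage.decommission.rddBlocks.enabled": "true",
}
```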
- 2734 Views
- 3 replies
- 0 kudos
When using spot fleet pools to schedule jobs, driver and worker nodes are provisioned from the spot pools, and we are noticing jobs failing with the exception below when there is a driver spot loss. Please share best practices around using fleet pools with 1...
Latest Reply
In this scenario, the driver node is reclaimed by AWS. Databricks has started a preview of the hybrid pools feature, which allows you to provision the driver node from a different pool. We recommend using an on-demand pool for the driver node to improve reliability i...
2 More Replies
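The hybrid-pools pattern described above can be sketched as a cluster spec that draws the driver and workers from separate pools. The pool IDs below are placeholders; `driver_instance_pool_id` is the cluster-spec field that selects a distinct pool for the driver:

```python
# Hedged sketch: a Databricks cluster spec using separate pools, so the
# driver comes from an on-demand pool while workers come from a
# spot-backed pool. Pool IDs are placeholders for illustration.
cluster_spec = {
    "cluster_name": "jobs-cluster",               # hypothetical name
    "instance_pool_id": "pool-spot-workers",      # placeholder: spot-backed worker pool
    "driver_instance_pool_id": "pool-ondemand",   # placeholder: on-demand pool for the driver
    "num_workers": 8,
}
```

Losing a worker from the spot pool is recoverable; losing the driver kills the job, which is why only the driver needs the on-demand guarantee.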
- 2219 Views
- 1 reply
- 0 kudos
Does the query have to be re-run from the start, or can it continue? I'm trying to evaluate the risk of using spot instances for production jobs.
Latest Reply
If a spot instance is reclaimed in the middle of a job, Spark will treat it as a lost worker. The Spark engine automatically retries the tasks from the lost worker on other available workers, so the query does not have to start over if indivi...
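How much spot churn a query can absorb before it does fail is governed by Spark's retry settings. A hedged sketch of the two main knobs, shown at their Spark defaults:

```python
# Hedged sketch: Spark retry settings that bound recovery from lost
# workers. A task is retried on another executor up to maxFailures
# times; a whole stage (e.g. after repeated shuffle-fetch failures
# from a reclaimed node) is re-attempted up to maxConsecutiveAttempts
# times before the job is aborted. Values shown are the defaults.
spark_conf = {
    "spark.task.maxFailures": "4",
    "spark.stage.maxConsecutiveAttempts": "4",
}
```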