Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

SQL Databricks - Spot VMs (Cost Optimized)

Gvsmao
New Contributor III

Hello! I want to ask a question please!

Referring to Spot VMs with the "Cost Optimized" setting:

Take an X-Small endpoint, which has 2 workers. If I send 10 simultaneous queries and a worker is evicted, can any of those queries fail with an error? Or would I just have to wait for another worker to come up, so that execution is only slower?

My fear is that a query may fail when a worker is evicted, because the tip below says that if a spot instance is reclaimed, queries running on that instance will need to be resubmitted.

  

image 

Unfortunately I am unable to test this, so I came to ask here on the forum.

Thanks!

1 ACCEPTED SOLUTION

Accepted Solutions

-werners-
Esteemed Contributor III

I think with spot instances there is always a chance of getting errors because of eviction.

If all workers are evicted, the query state is probably lost as well, unless Databricks SQL keeps it stored somewhere.

For job clusters the state is lost anyway; perhaps it works differently on Databricks SQL.
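
If queries do have to be resubmitted after an eviction, the client can handle that itself. Below is a minimal sketch of that idea in Python, not a Databricks API: `run_with_resubmit` and the `submit` callable are hypothetical names, and `submit` stands for whatever function sends your query and returns its result. The sketch catches every exception for simplicity; in practice you would probably narrow that to the error your client raises when a worker is lost.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_resubmit(
    submit: Callable[[], T],
    max_attempts: int = 3,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call submit() and resubmit it if it raises, up to max_attempts times.

    `submit` is a hypothetical callable that sends one query and returns
    its result. The wait between attempts doubles each time
    (base_delay_s, 2 * base_delay_s, ...), which gives a replacement
    worker some time to come up.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except Exception:
            # Assumption: any failure is treated as retryable. Narrow this
            # to your client's eviction-related error in real use.
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** (attempt - 1))

    # Unreachable: the loop either returns or re-raises.
    raise RuntimeError("unreachable")
```

This only makes sense for queries that are safe to run twice (plain reads, or idempotent writes), since the client cannot tell how far the first attempt got before the worker disappeared.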


7 REPLIES

Gvsmao
New Contributor III

Okay, thanks for the contribution, @Werner Stinckens!

That's really the point: does Databricks SQL keep the query state stored somehow, so that there are no errors when a worker comes back?

I will really have to test it.

Regards

-werners-
Esteemed Contributor III

I doubt that is the case. It would mean Databricks has to run some kind of permanent node that holds the query state in RAM (or on disk).

Gvsmao
New Contributor III

I agree with you

Anonymous
Not applicable

@Gvsmao - Hello there! My name is Piper, and I'm a moderator for Databricks. When you find your answer, would you come back to let us know and mark it as best, unless werners' answer helped the most?

Either way, we are interested!

Gvsmao
New Contributor III

@Piper Wilson Ok!
