
Spark doesn't register executors when new workers are allocated

ivanychev
Contributor II

Our pipelines sometimes get stuck (example).

Some workers get decommissioned due to spot termination, and then new workers get added.

Screenshot 2023-12-11 at 11.12.05.png

However, after (1) Spark doesn't notice the new executors:

Screenshot 2023-12-11 at 11.08.56.png

I don't know why this happens, and I'm not sure how to debug it, but here are some of my observations:

* The init script logs of the workers that Spark doesn't notice are fine; the scripts complete successfully.

* The driver logs don't show anything significant after the old executors get decommissioned. The driver simply doesn't notice the new executors.

Screenshot 2023-12-11 at 11.48.50.png

How do I debug this, and what could the issue be?

 

Sergey
1 REPLY

shan_chandra
Databricks Employee

@ivanychev - First, the new workers are added and Spark does notice them, which is why the event log contains entries stating that the init script ran on the newly added workers. For debugging, please check the Executors tab in the Spark UI.
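The same information shown in the Executors tab is also exposed by Spark's monitoring REST API (`/api/v1/applications/<app-id>/allexecutors`), so the check can be scripted. The sketch below is a minimal illustration only: `fetch_all_executors` assumes the driver's UI endpoint is reachable at a base URL you supply (how to reach it on a Databricks cluster is not covered in this thread), and the `sample` records are hypothetical data, not output from the cluster in question.

```python
import json
from urllib.request import urlopen


def fetch_all_executors(base_url, app_id):
    """Fetch every executor (active and removed) from Spark's monitoring REST API.

    base_url and app_id are supplied by the caller; reachability of the
    driver UI from where this runs is an assumption.
    """
    url = f"{base_url}/api/v1/applications/{app_id}/allexecutors"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize_executors(executors):
    """Split executor records into active and removed, skipping the driver entry."""
    workers = [e for e in executors if e.get("id") != "driver"]
    active = [e for e in workers if e.get("isActive")]
    removed = [e for e in workers if not e.get("isActive")]
    return {
        "active": sorted(e["hostPort"] for e in active),
        "removed": {e["id"]: e.get("removeReason", "") for e in removed},
    }


# Hypothetical records shaped like the API's executor summaries.
sample = [
    {"id": "driver", "hostPort": "10.0.0.1:40000", "isActive": True},
    {"id": "0", "hostPort": "10.0.0.2:40001", "isActive": False,
     "removeReason": "worker decommissioned"},
    {"id": "1", "hostPort": "10.0.0.3:40002", "isActive": True},
]
summary = summarize_executors(sample)
print(summary)
```

If a replacement worker's host never appears under `active` after the old executors show up under `removed`, that would confirm the executor never registered with the driver, which narrows the search to the worker's own executor logs.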

Second, spot instance termination is mostly driven by the cloud provider and by spot price fluctuations. Ideally, use a hybrid cluster (spot instances with fallback to on-demand), which you can enable with the corresponding flag on the cluster configuration page.

Reference: https://docs.databricks.com/en/compute/cluster-config-best-practices.html#on-demand-and-spot-instanc...
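For clusters created through the Clusters API rather than the configuration page, the spot-with-fallback setting can be expressed in the cluster spec. The sketch below assumes an AWS workspace, which the thread does not state; it only builds the relevant fragment of the request body, and the helper name `spot_with_fallback_attributes` is hypothetical. Other clouds use a different attributes block.

```python
import json


def spot_with_fallback_attributes(first_on_demand=1):
    """Build the AWS-specific fragment of a cluster spec (assumes AWS).

    availability="SPOT_WITH_FALLBACK" requests spot instances and falls back
    to on-demand when spot capacity cannot be acquired. first_on_demand keeps
    the first N nodes (starting with the driver) on on-demand instances.
    """
    return {
        "aws_attributes": {
            "availability": "SPOT_WITH_FALLBACK",
            "first_on_demand": first_on_demand,
        }
    }


# Merge this fragment into the rest of your cluster definition.
cluster_fragment = spot_with_fallback_attributes(first_on_demand=1)
print(json.dumps(cluster_fragment, indent=2))
```

Keeping at least the driver on an on-demand instance means a spot reclaim affects only workers, not the whole application.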

Thanks, Shan
