Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Databricks job keeps failing due to executor loss

amitkmaurya
Contributor

I am getting the following error while saving a DataFrame partitioned by two columns.

Job aborted due to stage failure: Task 5774 in stage 33.0 failed 4 times, most recent failure: Lost task 5774.3 in stage 33.0 (TID 7736) (13.2.96.110 executor 7): ExecutorLostFailure (executor 7 exited caused by one of the running tasks) Reason: Command exited with code 137

Please help me understand why I am getting this error and how it can be solved.

Cluster: driver + executor, 64 GB / 16 cores

1 ACCEPTED SOLUTION

amitkmaurya
Contributor

Hi, 

I solved the problem while keeping the same workers and driver.

In my case, data skew was the problem.
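For anyone who wants to check whether skew is the cause in their own job, here is a rough sketch of counting rows per key combination. It assumes `df` is a PySpark DataFrame; the function name `rows_per_key` and its parameters are made up for illustration and are not from the original job.

```python
def rows_per_key(df, key_cols, top_n=20):
    """Return the top_n most frequent key combinations with their row counts.

    Assumes `df` is a pyspark.sql.DataFrame. `key_cols` would be the columns
    used to partition the output (hypothetical names, not from the post).
    """
    return (
        df.groupBy(*key_cols)               # one group per key combination
          .count()                          # adds a "count" column
          .orderBy("count", ascending=False)  # largest groups first
          .limit(top_n)
    )
```

Calling `rows_per_key(df, ["col_a", "col_b"]).show()` would print the heaviest keys. If a handful of combinations hold most of the rows, the tasks that handle them carry far more data than the rest.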

Adding a repartition to the DataFrame just before writing distributed the data evenly across the nodes, and the stage failure was resolved.
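A minimal sketch of what a repartition before the write can look like. The exact call used in the job is not shown in this thread, so everything here is an assumption: the helper name `write_repartitioned`, the Parquet output format, and the choice of a plain `repartition(n)` (a round-robin shuffle) rather than repartitioning by column.

```python
def write_repartitioned(df, path, partition_cols, num_partitions):
    """Repartition `df` just before a partitioned write.

    Assumes `df` is a pyspark.sql.DataFrame. All names and the Parquet
    format are illustrative assumptions, not taken from the original job.
    """
    # repartition(n) without columns shuffles rows round-robin, so each of
    # the n tasks gets roughly the same number of rows regardless of key skew.
    balanced = df.repartition(num_partitions)
    # partitionBy only controls the output directory layout
    # (one directory per combination of values in partition_cols).
    balanced.write.partitionBy(*partition_cols).parquet(path)
    return balanced
```

Usage would be something like `write_repartitioned(df, "/some/output/path", ["col_a", "col_b"], 200)`. One trade-off with this variant: every task can hold rows for many key combinations, so it may write more, smaller files per output directory.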

Thanks @Retired_mod for your insights.
