Error: The spark driver has stopped unexpectedly and is restarting. Your notebook will be automatically reattached.

SaraCorralLou
New Contributor III

What is the problem?

I get this error every time I run a Python notebook from my Repo in Databricks.

Background

The notebook that fails builds a dataframe and, as its last step, writes that dataframe to a Delta table that already exists in Databricks.

The dataframe created has approximately 16,000,000 records. 

In the notebook I don't have any display(), print(), ... commands; it only builds this dataframe from other dataframes created earlier.

This notebook was working a few days ago with the same number of records, but now I get this error. I have read in other discussions that it could be a memory problem, so I have taken the following steps:

  • I have changed the configuration of the cluster where I run it. The configuration now includes:
    • Worker type: Standard_DS4_v2, 28 GB memory, 8 cores
    • Driver type: Standard_DS5_v2, 56 GB memory, 16 cores
    • Min workers: 2 and Max workers: 8
    • spark.databricks.io.cache.enabled true
    • spark.databricks.driver.disableScalaOutput true
  • I have run the notebook as part of a Job in order to use a Job Cluster.
  • I removed the part of the code that writes the data to the existing Delta table, to check whether the problem was in that step, and I still got the same error.
  • I have tried restarting the cluster, and detaching it from my notebook and re-attaching it.
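
For reference, the cluster settings listed above can be written out as a partial cluster spec. This is only a sketch: the field names (node_type_id, driver_node_type_id, autoscale, spark_conf) follow the Databricks Clusters API as I understand it, the variable name cluster_spec is just for illustration, and fields not mentioned above (such as the runtime version) are left out.

```python
import json

# Partial cluster spec built only from the settings listed above.
# Assumption: keys follow the Databricks Clusters API naming; fields that
# are not stated in the post (e.g. the runtime version) are omitted.
cluster_spec = {
    "node_type_id": "Standard_DS4_v2",         # worker: 28 GB memory, 8 cores
    "driver_node_type_id": "Standard_DS5_v2",  # driver: 56 GB memory, 16 cores
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "spark_conf": {
        "spark.databricks.io.cache.enabled": "true",
        "spark.databricks.driver.disableScalaOutput": "true",
    },
}

# Print the spec as JSON so it can be compared with the cluster's JSON view.
print(json.dumps(cluster_spec, indent=2))
```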

Could you help me? I don't know whether the problem comes from the cluster configuration or from somewhere else, because a few days ago I was able to run the notebook without any problem.

Thank you so much in advance, I look forward to hearing from you.