Cedric
Databricks Employee

Hi @Abdullah Durrani,

Spark workers will spill data to disk when a dataset is larger than the available executor memory, so jobs can still complete rather than failing outright.
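
If it helps, here's a minimal PySpark sketch of how you can see this behavior when caching (the session name and input path are just placeholders, not anything from your setup):

```python
from pyspark.sql import SparkSession
from pyspark import StorageLevel

# Placeholder session and path for illustration only.
spark = SparkSession.builder.appName("spill-demo").getOrCreate()
df = spark.read.parquet("/mnt/data/large_dataset")

# MEMORY_AND_DISK keeps as many partitions in memory as fit
# and spills the remainder to local disk instead of failing.
df.persist(StorageLevel.MEMORY_AND_DISK)

# An action materializes the cache; spilled partitions are
# read back from disk on subsequent accesses.
df.count()
```

Note that shuffle operations spill to disk automatically regardless of storage level; the explicit level above only controls caching behavior. Heavy spilling is usually a sign the cluster is undersized for the workload.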

I'd advise following the cluster configuration best practices page https://docs.databricks.com/clusters/cluster-config-best-practices.html#cluster-sizing-consideration... to determine the right cluster size for your use case.
