Databricks Community

abd · 06-28-2022

If my cluster has 1GB of memory, for example, and my data is 1TB, how will Spark handle it?

If Spark does in-memory computing, how does it handle data that is larger than the available memory?

Cedric · 06-29-2022

Hi @Abdullah Durrani,

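Spark does not need to hold the whole dataset in memory at once. It processes the data partition by partition, and the workers spill data to disk when it does not fit in memory. This works, but it is much slower than keeping the data in memory.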
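To illustrate what spilling to disk means, here is a minimal sketch in plain Python. It is not Spark's actual implementation, just the general idea: keep only a bounded number of items in memory, write each full buffer to a temporary file as a sorted run, then merge the runs while streaming them back from disk. The names `external_sort` and `max_in_memory` are hypothetical and exist only for this example.

```python
import heapq
import os
import tempfile


def external_sort(values, max_in_memory):
    """Yield integers from `values` in sorted order while buffering
    at most `max_in_memory` of them in memory before spilling."""
    spill_files = []  # paths of the sorted runs written to disk
    buffer = []       # the in-memory working set

    def spill():
        # Sort the in-memory buffer and write it to disk as one run.
        buffer.sort()
        f = tempfile.NamedTemporaryFile("w+", delete=False, suffix=".spill")
        f.writelines(f"{v}\n" for v in buffer)
        f.close()
        spill_files.append(f.name)
        buffer.clear()

    for v in values:
        buffer.append(v)
        if len(buffer) >= max_in_memory:
            spill()  # memory budget reached: move the data to disk
    if buffer:
        spill()

    # Merge the sorted runs. heapq.merge reads lazily, so only one
    # item per run is held in memory during the merge.
    handles = [open(path) for path in spill_files]
    try:
        runs = [(int(line) for line in h) for h in handles]
        yield from heapq.merge(*runs)
    finally:
        for h in handles:
            h.close()
        for path in spill_files:
            os.remove(path)
```

For example, `list(external_sort([5, 3, 9, 1, 7, 2, 8], max_in_memory=3))` sorts seven values while never buffering more than three of them at a time. The price is extra disk I/O, which is why a job that spills heavily runs much slower than one that fits in memory.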

I'd advise you to follow the best practices page https://docs.databricks.com/clusters/cluster-config-best-practices.html#cluster-sizing-consideration... to determine what cluster size you should configure for your use case.

abd · 06-29-2022

@Kaniz Fatma @Cedric Law Hing Ping

How will Spark handle 1TB of data on a cluster with 1GB of memory?