โ06-28-2022 10:03 PM
If my cluster memory is 1 GB, for example, and my data is 1 TB, how will Spark handle it?
If it is in-memory computing, how does it handle data that is larger than the memory?
โ06-29-2022 05:33 AM
Hi @Abdullah Durraniโ,
Spark workers will spill the data on disk if the dataset is larger than the memory size.
I'd advise you to follow the best practices page https://docs.databricks.com/clusters/cluster-config-best-practices.html#cluster-sizing-consideration... to determine what cluster size you should configure for your use case.
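To make the spill behavior concrete, here is a minimal PySpark sketch, assuming a standard Spark/Databricks environment; the input path and column name are hypothetical placeholders, not from the original thread:

```python
# Minimal sketch, assuming a PySpark environment.
# The path "/mnt/data/large_dataset" and the column "customer_id"
# are hypothetical placeholders.
from pyspark import StorageLevel
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spill-demo").getOrCreate()

# Spark reads the input as many partitions and streams them through the
# executors, so the full dataset never has to fit in memory at once.
df = spark.read.parquet("/mnt/data/large_dataset")

# MEMORY_AND_DISK caches partitions in memory while they fit and spills
# the rest to the executors' local disks instead of failing.
df.persist(StorageLevel.MEMORY_AND_DISK)

# Wide operations like this aggregation can also spill shuffle data to
# disk when execution memory runs low.
df.groupBy("customer_id").count().show()
```

Because partitions that don't fit are written to local disk, a cluster with far less memory than the dataset can still complete the job; it will just run much more slowly than a properly sized cluster.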
โ06-29-2022 05:12 AM
@Kaniz Fatmaโ @Cedric Law Hing Pingโ