Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How will Spark handle 1TB of data on a cluster with 1GB of memory?

abd
Contributor

If my cluster has only 1GB of memory, for example, and my data is 1TB, how will Spark handle it?

If Spark is an in-memory computing engine, how does it handle data that is larger than the available memory?

1 ACCEPTED SOLUTION


Cedric
Valued Contributor

Hi @Abdullah Durrani​,

Spark workers will spill data to disk if the dataset is larger than the available memory, so a job is not limited to datasets that fit in RAM.

I'd advise you to follow the cluster-sizing best practices page https://docs.databricks.com/clusters/cluster-config-best-practices.html#cluster-sizing-consideration... to determine what cluster size to configure for your use case.
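The spill-and-merge idea behind this answer can be illustrated with a minimal pure-Python sketch (this is not Spark source code; `external_word_count` and `max_in_memory` are illustrative names). Spark executors do something analogous per partition: build a hash table in memory, write partial results to local disk when a threshold is reached, then merge the spilled parts:

```python
import os
import pickle
import tempfile
from collections import defaultdict

def external_word_count(records, max_in_memory=1000):
    """Count words from an iterable, spilling partial counts to disk
    whenever the in-memory table grows past max_in_memory keys --
    a simplified version of what an executor does when a partition
    does not fit in RAM."""
    spill_files = []
    counts = defaultdict(int)

    def spill():
        # Persist the current partial counts to a temp file, then free RAM.
        fd, path = tempfile.mkstemp(suffix=".spill")
        with os.fdopen(fd, "wb") as f:
            pickle.dump(dict(counts), f)
        spill_files.append(path)
        counts.clear()

    for word in records:
        counts[word] += 1
        if len(counts) >= max_in_memory:
            spill()

    # Merge phase: fold every spilled partial result back into the table.
    for path in spill_files:
        with open(path, "rb") as f:
            for word, n in pickle.load(f).items():
                counts[word] += n
        os.remove(path)
    return dict(counts)
```

At no point does the full dataset need to fit in memory — only `max_in_memory` keys plus one spilled chunk at merge time. Spark applies the same principle, which is why a 1GB cluster can (slowly) process 1TB, at the cost of heavy disk I/O.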


3 REPLIES

abd
Contributor

@Kaniz Fatma​ @Cedric Law Hing Ping​ 


Kaniz_Fatma
Community Manager

Hi @Abdullah Durrani​, please check this S.O. link.
