How to Read Terabytes of data in Databricks

Abhijeet
New Contributor III

I want to read 1000 GB of data. Since Spark does in-memory transformations, do I need worker nodes with a combined memory of 1000 GB?

Also, I want to understand whether reading the data means all 1000 GB is stored in memory. If so, how is caching a DataFrame different from that?
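
One way to picture the difference being asked about is a rough plain-Python analogy (this is not Spark code, and `read_partitions` is a made-up stand-in for a file scan). Reading is lazy, so each partition can be processed and then released, whereas caching asks for every partition to be kept around for reuse:

```python
from typing import Iterator, List


def read_partitions(n: int) -> Iterator[List[int]]:
    """Lazily yield one small 'partition' at a time (hypothetical stand-in for a scan)."""
    for i in range(n):
        yield [i] * 4  # made-up partition contents


# Streaming style: each partition is consumed and dropped before the next one is read.
total_streamed = sum(sum(p) for p in read_partitions(5))

# Cache style: every partition is materialized and held so it can be reused later.
cached = list(read_partitions(5))
total_cached = sum(sum(p) for p in cached)

print(total_streamed, total_cached, len(cached))
```

Both styles give the same result; only the cached version holds all partitions at once. In Spark the analogous request is made with `df.cache()`, which for DataFrames can also fall back to disk when memory runs short.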

5 REPLIES

Aviral-Bhardwaj
Esteemed Contributor III

In a master and worker node system, your data will be divided into chunks of 128 MB.

So 1000/128 = 7.8125

So it will need about 7-8 partitions of that data, which means you don't need a 1000 GB cluster. 2-3 nodes with 10-30 GB each will work fine.

Let me know if I am wrong here

Thanks

Aviral Bhardwaj

The number of partitions will be

1000*1024/128 = 8000

So my question is this: all these 8000 partitions combined will still be 1000 GB.

And I am creating a DataFrame from this data.

How is this data loaded? It must somehow be held in memory.

So I am just trying to understand what happens at the backend: how the data is read, and how the nodes manage this load.
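
The arithmetic above can be extended into a rough estimate of how much input is actually being read at any one moment. This is a minimal sketch, not a description of Spark's internals: it assumes 128 MB splits (as in the thread) and a hypothetical cluster of 3 workers with 8 cores each, where each core processes one partition at a time. The function names are illustrative only.

```python
import math


def partition_count(total_mb: int, split_mb: int) -> int:
    """Number of input partitions when the data is split into chunks of split_mb."""
    return math.ceil(total_mb / split_mb)


def in_flight_mb(task_slots: int, split_mb: int) -> int:
    """Raw input being read concurrently if each task slot handles one partition."""
    return task_slots * split_mb


TOTAL_MB = 1000 * 1024   # 1000 GB expressed in MB
SPLIT_MB = 128           # split size used in the thread
TASK_SLOTS = 3 * 8       # assumption: 3 workers x 8 cores (hypothetical cluster)

partitions = partition_count(TOTAL_MB, SPLIT_MB)
waves = math.ceil(partitions / TASK_SLOTS)       # rounds of tasks needed to cover all partitions
concurrent_mb = in_flight_mb(TASK_SLOTS, SPLIT_MB)

print(f"partitions={partitions}, waves={waves}, in flight ~{concurrent_mb} MB")
```

Under these assumptions only about 3 GB of raw input is being read at once, and the 8000 partitions are worked through in successive waves. The figure covers raw input only: decompressed or deserialized data, and operations such as joins and sorts, can need considerably more memory, in which case Spark may spill to disk.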

Ajay-Pandey
Esteemed Contributor II

Hi @Abhijeet Singh, the blog below might help you:

Link

Kaniz
Community Manager

Hi @Abhijeet Singh, we haven't heard from you since the last response from @Ajay Pandey, and I was checking back to see if his suggestions helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.

Abhijeet
New Contributor III

None of the answers are relevant to my question.
