The number of partitions will be:
1000 * 1024 / 128 = 8000
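To double-check the arithmetic above, here is a small Python sketch. It assumes one partition per 128 MB chunk of input; the function name and parameters are just illustrative, and the real partition count can differ depending on file layout and configuration.

```python
import math


def estimate_partitions(total_gb: float, split_mb: int = 128) -> int:
    """Rough partition estimate: one partition per split_mb of input (assumption)."""
    total_mb = total_gb * 1024  # convert GB to MB
    # Round up so a trailing partial chunk still counts as a partition.
    return math.ceil(total_mb / split_mb)


print(estimate_partitions(1000))  # 8000
```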
So my question is this: all 8000 of these partitions combined add up to 1000 GB,
and I am creating a DataFrame from this data.
How is this data loaded? It would somehow have to be held in memory.
I am just trying to understand what happens in the backend: how the data is read, and how the nodes manage this load.