The number of partitions will be:

1000 * 1024 / 128 = 8000 (1000 GB converted to MB, divided by 128 MB per partition)
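To make the arithmetic explicit, here is a minimal sketch in plain Python. It assumes 1000 GB of input and 128 MB per partition; `estimate_partitions` and its parameters are names I made up for illustration, not a Spark API, and the real partition count may differ depending on file layout and configuration.

```python
import math


def estimate_partitions(total_gb: float, split_mb: float = 128) -> int:
    """Estimate the number of input partitions (hypothetical helper, not a Spark API)."""
    total_mb = total_gb * 1024  # 1 GB = 1024 MB
    # Round up, because a final partial split still needs its own partition.
    return math.ceil(total_mb / split_mb)


num_partitions = estimate_partitions(1000)
print(num_partitions)  # 8000
```

This is only the back-of-the-envelope estimate from above; it says nothing yet about how much of that data is in memory at any one time.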

So my question is this: all 8000 of these partitions combined will add up to 1000 GB.

And I am creating a DataFrame from this data.

How is this data loaded? It will somehow need to hold the data in memory.

So I am just trying to understand what happens on the backend: how the data is read, and how the nodes manage this load.