erigaud
Honored Contributor

Have you tried specifying the schema when creating the DataFrame? Providing explicit types skips schema inference and can help reduce memory usage.
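
As a rough sketch of what an explicit schema could look like in PySpark: the column names and types below are made up for illustration (your data will differ), and `build_dataframe` is a hypothetical helper that expects an existing `SparkSession`. `createDataFrame` accepts either a `StructType` or a DDL-formatted string as the schema; the string form is used here to keep it short.

```python
# Hypothetical columns and types; replace them with the ones in your data.
# Spark accepts a DDL-formatted string as a schema definition.
EVENT_SCHEMA = "id INT, name STRING, amount DOUBLE"


def build_dataframe(spark, rows):
    """Create a DataFrame with an explicit schema instead of inferred types.

    `spark` is an existing SparkSession; `rows` is an iterable of tuples
    whose fields match EVENT_SCHEMA in order.
    """
    return spark.createDataFrame(rows, schema=EVENT_SCHEMA)
```

In a notebook you would call it as `df = build_dataframe(spark, rows)` and check the result with `df.printSchema()`.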

Furthermore, you could load your data incrementally into a bronze Delta table instead of loading the full million rows at once.
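
One possible way to do the incremental load, assuming the rows arrive as a Python iterable: split them into batches and append each batch to the Delta table. The helper names, the table name `bronze.events`, and the batch size are all assumptions for the sake of the example, not something from your setup.

```python
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield successive lists of at most `batch_size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_to_bronze(spark, rows, schema, table_name="bronze.events",
                   batch_size=100_000):
    """Append `rows` to a Delta table one batch at a time.

    `spark` is an existing SparkSession. The table name and batch size
    are placeholder values. Returns the number of rows written.
    """
    total = 0
    for batch in iter_batches(rows, batch_size):
        df = spark.createDataFrame(batch, schema=schema)
        # Append mode adds the batch without rewriting earlier ones.
        df.write.format("delta").mode("append").saveAsTable(table_name)
        total += len(batch)
    return total
```

This way only one batch is held as a DataFrame at any time. If the source is a set of files rather than in-memory rows, reading and appending one file (or group of files) per iteration follows the same pattern.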

Hope this helps!