llvu
New Contributor III

Thank you for the fast reply. Let me try this out!

I was indeed under the impression that caching the DataFrame would improve performance, not make it worse.

Would you happen to know why optimizing the saved tables does not give the optimal number of partitions? Or does it give a number of partitions that is optimal for storage rather than for computation?