Pre-Partitioning a delta table to reduce shuffling of wide operations

Maatari — Tue, 13 Aug 2024 13:02:25 GMT

Assume I need to perform a groupBy, i.e. an aggregation, on a dataset stored in a Delta table. If the Delta table is partitioned by the field I group on, does that affect the shuffling that the groupBy would normally cause?

As a related question: is there any correlation between how a Delta table is partitioned and how the data is distributed across the DataFrame's partitions when it is loaded?
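
To make the question concrete, here is a minimal sketch of how one might check this empirically rather than a statement of how Delta behaves. It loads a Delta table, reports how many DataFrame partitions the read produced, and looks for an Exchange node (the physical-plan marker of a shuffle) in the plan of a groupBy on the given column. The helper names (plan_text, has_shuffle, inspect_groupby) and the table_path and key arguments are hypothetical, and df.rdd may not be available on every cluster access mode.

```python
import contextlib
import io


def plan_text(df) -> str:
    """Capture the text that DataFrame.explain() prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        df.explain()
    return buf.getvalue()


def has_shuffle(plan: str) -> bool:
    """Crude check: any Exchange node in the plan text counts as a shuffle."""
    return "Exchange" in plan


def inspect_groupby(spark, table_path: str, key: str) -> None:
    """Hypothetical helper: table_path and key are placeholders for your table."""
    df = spark.read.format("delta").load(table_path)

    # Number of DataFrame partitions produced by the read (assumes df.rdd is
    # accessible in your environment).
    print("input partitions:", df.rdd.getNumPartitions())

    # Aggregate on the column the table is partitioned by, then look at the plan.
    grouped = df.groupBy(key).count()
    print("shuffle in plan:", has_shuffle(plan_text(grouped)))
```

Calling inspect_groupby(spark, "/path/to/table", "my_partition_col") on a table partitioned by that column, and again on an unpartitioned copy, would show whether the plan differs between the two.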

Re: Pre-Partitioning a delta table to reduce shuffling of wide operations

Retired_mod — Wed, 14 Aug 2024 08:07:13 GMT

Hi @Maatari, Thanks for reaching out! Please review the responses and let us know which best addresses your question. Your feedback is valuable to us and the community.

If the response resolves your issue, kindly mark it as the accepted solution. This will help close the thread and assist others with similar queries.

We appreciate your participation and are here if you need further assistance!