Liquid clustering not improved performance

SusmithaBadam — Mon, 04 Aug 2025 09:27:20 GMT

Hi There,

I have a 160 GB table partitioned on the country and yearmonth columns. I keep 6 years of history and replace the latest 2 months of partitions to load the new data.

I use overwrite mode to replace the affected partitions. The entire ETL process runs without any failures, but the data is heavily skewed across partitions. I did a POC with liquid clustering on a reduced 45 GB copy of the table, but did not see much improvement.
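For readers following along, here is a minimal sketch of how "replace only the latest 2 months" could be expressed with Delta Lake's `replaceWhere` write option. It is an illustration, not the poster's actual job: the helper names (`latest_yearmonths`, `build_replace_where`, `overwrite_latest`) are hypothetical, and it assumes `yearmonth` is stored as an integer such as 202508.

```python
from datetime import date


def latest_yearmonths(today, months=2):
    """Return the last `months` year-months up to `today` as ints (e.g. 202508), oldest first.

    Assumption: yearmonth is stored as an integer in YYYYMM form.
    """
    year, month = today.year, today.month
    result = []
    for _ in range(months):
        result.append(year * 100 + month)
        month -= 1
        if month == 0:
            # Roll back from January to December of the previous year.
            year, month = year - 1, 12
    return sorted(result)


def build_replace_where(yearmonths):
    """Build a SQL predicate selecting the year-months to overwrite."""
    values = ", ".join(str(ym) for ym in yearmonths)
    return f"yearmonth IN ({values})"


def overwrite_latest(df, table_name, today):
    """Overwrite only the rows matching the predicate (hypothetical helper).

    `df` is a Spark DataFrame that is expected to contain only rows
    for the months being replaced.
    """
    predicate = build_replace_where(latest_yearmonths(today))
    (
        df.write.format("delta")
        .mode("overwrite")
        .option("replaceWhere", predicate)
        .saveAsTable(table_name)
    )


if __name__ == "__main__":
    print(build_replace_where(latest_yearmonths(date(2025, 8, 4))))
```

Because the predicate is built from the run date, the same job can be rerun each month without hard-coding which months to replace.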

Observation:

A SELECT with GROUP BY on the clustered table (after OPTIMIZE) takes 39 sec, whereas the same query on the partitioned table takes 2 sec. Writes are faster, but read performance is much worse.

I have attached an Excel file with the read/write performance differences. I want to take advantage of liquid clustering, but so far no luck.

Re: Liquid clustering not improved performance

Renu_ — Mon, 04 Aug 2025 14:36:25 GMT

Hi @SusmithaBadam, based on your use case, the partitioned table performs better because partitions work kind of like labeled folders. When you group by the partition columns, the engine can go straight to the exact folders instead of scanning everything, so it’s much faster.

Liquid clustering, on the other hand, shines when you need to filter on other detailed (high-cardinality) columns, but for your group-by queries on the partition columns, it can’t take that shortcut. So for your current setup, sticking with partitioned tables makes more sense performance-wise.
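The "labeled folders" idea can be sketched with a toy model in plain Python. This is not how Delta Lake is implemented and it does not model liquid clustering itself. It only contrasts reading the folders whose labels match with reading every file when no shortcut is available. The sample rows and function names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample rows: (country, yearmonth, amount).
ROWS = [
    ("US", 202507, 10),
    ("US", 202508, 20),
    ("DE", 202507, 5),
    ("DE", 202508, 7),
    ("IN", 202508, 3),
]


def partitioned_layout(rows):
    """Group rows into 'folders' labeled by (country, yearmonth)."""
    folders = defaultdict(list)
    for country, yearmonth, amount in rows:
        folders[(country, yearmonth)].append(amount)
    return folders


def sum_for_country_partitioned(folders, country):
    """Sum amounts for one country, opening only folders whose label matches."""
    opened, total = 0, 0
    for (label_country, _yearmonth), amounts in folders.items():
        if label_country == country:
            opened += 1
            total += sum(amounts)
    return total, opened


def flat_layout(rows, rows_per_file=2):
    """Split rows into fixed-size 'files' that carry no folder labels."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def sum_for_country_flat(files, country):
    """Sum amounts for one country when every file has to be opened and filtered."""
    opened, total = 0, 0
    for file_rows in files:
        opened += 1
        total += sum(amount for c, _ym, amount in file_rows if c == country)
    return total, opened


if __name__ == "__main__":
    print(sum_for_country_partitioned(partitioned_layout(ROWS), "US"))
    print(sum_for_country_flat(flat_layout(ROWS), "US"))
```

Both functions return the same total. The difference is in how many folders or files had to be opened to get it, which is the shortcut described above.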
