Databricks Community

data-engineer-d · ‎07-03-2024

We enabled liquid clustering on one of our large tables (380 GB). This table undergoes many operations daily, and their performance improved manyfold after liquid clustering. However, after enabling liquid clustering and optimizing the table, the number of files increased.

Previously it had around 4300 files, and now it shows 7900 files, even though the table size is almost the same before and after.

It is clustered on two columns, both of which are among the first 32 columns. How can we explain this increase in the number of files, i.e. the decrease in data per file?
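As a rough check on the figures above, the average data per file can be worked out directly. This is only a sketch: it assumes the table size stayed at exactly 380 GB and treats 1 GB as 1024 MB, and the helper name `avg_file_size_mb` is made up for illustration.

```python
def avg_file_size_mb(total_gb: float, num_files: int) -> float:
    """Average data per file in MB, treating 1 GB as 1024 MB (an assumption)."""
    return total_gb * 1024 / num_files


# Figures quoted in the post; the size is assumed unchanged before and after.
TABLE_SIZE_GB = 380
FILES_BEFORE = 4300
FILES_AFTER = 7900

before = avg_file_size_mb(TABLE_SIZE_GB, FILES_BEFORE)
after = avg_file_size_mb(TABLE_SIZE_GB, FILES_AFTER)

print(f"before clustering: {before:.1f} MB per file")
print(f"after clustering:  {after:.1f} MB per file")
print(f"ratio after/before: {after / before:.2f}")
```

Under those assumptions the average drops from roughly 90 MB to roughly 49 MB per file, which is the decrease the question is about.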

data-engineer-d · ‎07-09-2024

Thank you for the detailed explanation, @Retired_mod.

Liquid Clustering - Number of files are increasing
