Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I am trying to re-optimize the a delta table with a max file size of 32 MB. But after changing spark.databricks.delta.optimize.maxFileSize and trying to optimize a partition, it doesn't split larger files to smaller ones. How can i get it to work.
spark.databricks.delta.optimize.maxFileSize controls the target size to binpack files when you run OPTIMIZE command. But it will not split larger files to smaller ones today. File splitting happens when ZORDER is ran however.
spark.databricks.delta.optimize.maxFileSize controls the target size to binpack files when you run OPTIMIZE command. But it will not split larger files to smaller ones today. File splitting happens when ZORDER is ran however.
Join Us as a Local Community Builder!
Passionate about hosting events and connecting people? Help us grow a vibrant local communityโsign up today to get started!