Auto optimize config

Anonymous
Not applicable

Does auto-optimize work for existing tables only, or will it work for both existing and new tables when we enable it at the cluster config level?

Mooune_DBU
Databricks Employee

If you're referring to Delta Tables, Auto-Optimize will work for both.

For new tables:

CREATE TABLE student (id INT, name STRING, age INT) TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true, delta.autoOptimize.autoCompact = true)

For existing tables:

ALTER TABLE [table_name | delta.`<table-path>`] SET TBLPROPERTIES (delta.autoOptimize.optimizeWrite = true, delta.autoOptimize.autoCompact = true)

If you want this to be the default behavior for all new tables, you can set it as a default setting:

set spark.databricks.delta.properties.defaults.autoOptimize.optimizeWrite = true;
set spark.databricks.delta.properties.defaults.autoOptimize.autoCompact = true;
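
To confirm which values actually landed on a table, one option is to inspect its properties. This is a minimal sketch, assuming the `student` table from the example above. The session defaults only apply to tables created after they are set, so an existing table will show only what was set on it explicitly:

```sql
-- List every property on the table, including any delta.autoOptimize.* entries.
SHOW TBLPROPERTIES student;

-- Or look up a single property by key.
SHOW TBLPROPERTIES student ('delta.autoOptimize.optimizeWrite');
```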


Sudip94
New Contributor II

I disabled auto optimize in my notebook, but auto optimize operations still show up when I run DESCRIBE HISTORY {tablename}. I used the following settings to turn it off:

spark.conf.set('spark.databricks.delta.properties.defaults.autoOptimize.optimizeWrite', 'false')
spark.conf.set('spark.databricks.delta.properties.defaults.autoOptimize.autoCompact', 'false')
spark.conf.set('spark.databricks.delta.optimizeWrite.enabled', 'false')
 
SET TBLPROPERTIES ('autoOptimize' = 'false', 'targetFileSize' = '16mb', 'optimizeWrite' = 'false', 'autoCompact' = 'false')
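
One thing that may be worth checking here: the `spark.databricks.delta.properties.defaults.*` settings only affect tables created afterwards, and the property keys in the statement above do not match the `delta.autoOptimize.*` names used in the answer above. Below is a sketch of disabling both features on an existing table, assuming a hypothetical table called `my_table`:

```sql
-- my_table is a placeholder name, not taken from the original post.
ALTER TABLE my_table SET TBLPROPERTIES (
  delta.autoOptimize.optimizeWrite = false,
  delta.autoOptimize.autoCompact = false
);

-- Check what is now recorded on the table.
SHOW TBLPROPERTIES my_table;
```

Whether this removes the entries seen in DESCRIBE HISTORY depends on what is triggering them, so treat it as a first step rather than a confirmed fix.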