Hello @pooja_bhumandla, I did some digging and here is what I found.
Short answer
Delta Lake does not support column selection (FOR COLUMNS or FOR ALL COLUMNS) when using ANALYZE … COMPUTE DELTA STATISTICS. Those clauses apply only to optimizer (CBO) statistics collected via COMPUTE STATISTICS, not to Delta’s data-skipping statistics.
If you want Delta data-skipping stats on a specific set of columns, the supported mechanism is the table property delta.dataSkippingStatsColumns. You define the exact list of columns there, then run ANALYZE TABLE … COMPUTE DELTA STATISTICS to (re)compute stats for those columns across existing files. There is no syntax today to target only newly added columns; recomputation is table-wide for the configured column list.
So yes: after updating the stats-column configuration, re-running ANALYZE TABLE … COMPUTE DELTA STATISTICS is the correct and supported approach. Updating the property alone does not backfill existing data.
What to do in your case
- Decide on the full set of columns that should have Delta data-skipping stats. This must include both the previously indexed columns and the newly added ones. When delta.dataSkippingStatsColumns is set, only the columns in that list will receive file-level min/max stats, both going forward and when recomputed.
- Apply the property and recompute Delta stats:
-- Include BOTH the previously indexed columns and the new columns
ALTER TABLE your_catalog.your_schema.your_table
SET TBLPROPERTIES (
'delta.dataSkippingStatsColumns' = 'col1,col2,...,col32,col33,col34,...'
);
-- Backfill Delta (data-skipping) stats for the configured columns
ANALYZE TABLE your_catalog.your_schema.your_table
COMPUTE DELTA STATISTICS;
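If you want to confirm that the property is set the way you expect, a quick check along these lines may help. This is only a sketch, and the table name is the same placeholder used in the statements just before it:

```sql
-- Show the configured stats-column list (placeholder table name)
SHOW TBLPROPERTIES your_catalog.your_schema.your_table ('delta.dataSkippingStatsColumns');
```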
Notes and clarifications
Your error is expected. ANALYZE … COMPUTE DELTA STATISTICS explicitly rejects FOR COLUMNS, FOR ALL COLUMNS, NOSCAN, and PARTITION clauses. Column selection for Delta stats is controlled exclusively via the table property.
If what you actually need are optimizer (CBO) stats, column-level syntax is supported there:
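For example, a minimal sketch with placeholder table and column names (col33 and col34 are assumptions standing in for your real columns):

```sql
-- Optimizer (CBO) statistics for selected columns only
-- col33 and col34 are placeholder column names
ANALYZE TABLE your_catalog.your_schema.your_table
COMPUTE STATISTICS FOR COLUMNS col33, col34;
```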
Just note that those collect optimizer stats, not Delta data-skipping stats.
Also, changing delta.dataSkippingStatsColumns does not automatically recompute stats on existing files. That’s why running ANALYZE … COMPUTE DELTA STATISTICS afterward is required to backfill.
Why this works
Delta data-skipping stats are governed by table properties (first 32 columns by default, or an explicit list you provide) and are recomputed via COMPUTE DELTA STATISTICS. There is currently no supported FOR COLUMNS equivalent for Delta stats.
Optimizer stats and Delta stats are separate systems. Column-level ANALYZE only applies to optimizer stats, not Delta data-skipping.
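A side note on the 32-column default: if you would rather widen the count-based default than maintain an explicit list, delta.dataSkippingNumIndexedCols is the other table property involved. The sketch below assumes a value of 40 purely for illustration. As far as I know, an explicit delta.dataSkippingStatsColumns list takes precedence when both are set, so treat this as an alternative rather than something to combine with the list.

```sql
-- Alternative to an explicit list: collect stats on the first N columns
-- The value 40 is an illustrative assumption, not a recommendation
ALTER TABLE your_catalog.your_schema.your_table
SET TBLPROPERTIES (
  'delta.dataSkippingNumIndexedCols' = '40'
);
-- Existing files still need a backfill after the change
ANALYZE TABLE your_catalog.your_schema.your_table
COMPUTE DELTA STATISTICS;
```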
Cheers, Louis