<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic FOR COLUMNS Not Supported in Delta ANALYZE? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/for-columns-not-supported-in-delta-analyze/m-p/147685#M52769</link>
    <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I’m running into the following error when trying to run ANALYZE on a Delta table:&lt;/P&gt;&lt;PRE&gt;[INVALID_SQL_SYNTAX.ANALYZE_TABLE_DELTA_STATS_UNEXPECTED_TOKEN]
Invalid SQL syntax: ANALYZE TABLE(S) ... COMPUTE DELTA STATISTICS FOR 
doesn't support: FOR ALL COLUMNS, FOR COLUMNS, NOSCAN, and PARTITION clauses.
SQLSTATE: 42000&lt;/PRE&gt;&lt;P&gt;The statement I ran:&lt;/P&gt;&lt;PRE&gt;ANALYZE TABLE your_catalog.your_schema.your_table
  COMPUTE DELTA STATISTICS
  FOR COLUMNS col33, col34, col35, col36, col37, col38, col39, col40, col41, col42;&lt;/PRE&gt;&lt;P&gt;My intent was to compute statistics only for a subset of columns (using FOR COLUMNS) after recently adding some new columns to dataSkippingStats.&lt;/P&gt;&lt;H3&gt;My questions:&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;Does Delta Lake support column-level ANALYZE at all (FOR COLUMNS / FOR ALL COLUMNS)?&lt;/LI&gt;&lt;LI&gt;If not, what is the correct way to recompute stats only for the newly added columns?&lt;/LI&gt;&lt;LI&gt;Is re-running ANALYZE TABLE table_name COMPUTE DELTA STATISTICS the only option, even if it recomputes stats for all columns?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Additional context:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Initially, dataSkippingStats had &lt;STRONG&gt;32 columns&lt;/STRONG&gt;, and stats were persisted for those.&lt;/LI&gt;&lt;LI&gt;Recently added &lt;STRONG&gt;a few new columns (say, 15)&lt;/STRONG&gt;.&lt;/LI&gt;&lt;LI&gt;I want to compute stats only for the newly added ones, if possible.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Any clarification would be appreciated!&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
    <pubDate>Mon, 09 Feb 2026 12:21:32 GMT</pubDate>
    <dc:creator>pooja_bhumandla</dc:creator>
    <dc:date>2026-02-09T12:21:32Z</dc:date>
    <item>
      <title>FOR COLUMNS Not Supported in Delta ANALYZE?</title>
      <link>https://community.databricks.com/t5/data-engineering/for-columns-not-supported-in-delta-analyze/m-p/147685#M52769</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I’m running into the following error when trying to run ANALYZE on a Delta table:&lt;/P&gt;&lt;PRE&gt;[INVALID_SQL_SYNTAX.ANALYZE_TABLE_DELTA_STATS_UNEXPECTED_TOKEN]
Invalid SQL syntax: ANALYZE TABLE(S) ... COMPUTE DELTA STATISTICS FOR 
doesn't support: FOR ALL COLUMNS, FOR COLUMNS, NOSCAN, and PARTITION clauses.
SQLSTATE: 42000&lt;/PRE&gt;&lt;P&gt;The statement I ran:&lt;/P&gt;&lt;PRE&gt;ANALYZE TABLE your_catalog.your_schema.your_table
  COMPUTE DELTA STATISTICS
  FOR COLUMNS col33, col34, col35, col36, col37, col38, col39, col40, col41, col42;&lt;/PRE&gt;&lt;P&gt;My intent was to compute statistics only for a subset of columns (using FOR COLUMNS) after recently adding some new columns to dataSkippingStats.&lt;/P&gt;&lt;H3&gt;My questions:&lt;/H3&gt;&lt;OL&gt;&lt;LI&gt;Does Delta Lake support column-level ANALYZE at all (FOR COLUMNS / FOR ALL COLUMNS)?&lt;/LI&gt;&lt;LI&gt;If not, what is the correct way to recompute stats only for the newly added columns?&lt;/LI&gt;&lt;LI&gt;Is re-running ANALYZE TABLE table_name COMPUTE DELTA STATISTICS the only option, even if it recomputes stats for all columns?&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Additional context:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Initially, dataSkippingStats had &lt;STRONG&gt;32 columns&lt;/STRONG&gt;, and stats were persisted for those.&lt;/LI&gt;&lt;LI&gt;Recently added &lt;STRONG&gt;a few new columns (say, 15)&lt;/STRONG&gt;.&lt;/LI&gt;&lt;LI&gt;I want to compute stats only for the newly added ones, if possible.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Any clarification would be appreciated!&lt;/P&gt;&lt;P&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Mon, 09 Feb 2026 12:21:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/for-columns-not-supported-in-delta-analyze/m-p/147685#M52769</guid>
      <dc:creator>pooja_bhumandla</dc:creator>
      <dc:date>2026-02-09T12:21:32Z</dc:date>
    </item>
    <item>
      <title>Re: FOR COLUMNS Not Supported in Delta ANALYZE?</title>
      <link>https://community.databricks.com/t5/data-engineering/for-columns-not-supported-in-delta-analyze/m-p/147724#M52771</link>
      <description>&lt;P&gt;Hello &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/170125"&gt;@pooja_bhumandla&lt;/a&gt;, I did some digging and here is what I found.&lt;/P&gt;
&lt;P class="p1"&gt;Short answer&lt;/P&gt;
&lt;P class="p1"&gt;Delta Lake does not support column selection (FOR COLUMNS or FOR ALL COLUMNS) when using ANALYZE … COMPUTE DELTA STATISTICS. Those clauses apply only to optimizer (CBO) statistics collected via COMPUTE STATISTICS, not to Delta’s data-skipping statistics.&lt;/P&gt;
&lt;P class="p1"&gt;If you want Delta data-skipping stats on a specific set of columns, the supported mechanism is the table property delta.dataSkippingStatsColumns. You define the exact list of columns there, then run ANALYZE TABLE … COMPUTE DELTA STATISTICS to (re)compute stats for those columns across existing files. There is no syntax today to target only the newly added columns; recomputation is table-wide for the configured column list.&lt;/P&gt;
&lt;P class="p1"&gt;So yes, after updating the stats-column configuration, re-running ANALYZE TABLE … COMPUTE DELTA STATISTICS is the correct and supported approach. Updating the property alone does not backfill existing data.&lt;/P&gt;
&lt;P class="p1"&gt;What to do in your case&lt;/P&gt;
&lt;OL start="1"&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Decide on the full set of columns that should have Delta data-skipping stats. The list must include both the previously indexed columns and the newly added ones, because once delta.dataSkippingStatsColumns is set, only the columns in that list receive file-level min/max stats, both going forward and when recomputed.&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;Apply the property and recompute Delta stats:&lt;/P&gt;
&lt;/LI&gt;
&lt;/OL&gt;
&lt;PRE&gt;&lt;CODE&gt;-- Include BOTH the previously indexed columns and the new columns
ALTER TABLE your_catalog.your_schema.your_table
  SET TBLPROPERTIES (
    'delta.dataSkippingStatsColumns' = 'col1,col2,...,col32,col33,col34,...'
  );

-- Backfill Delta (data-skipping) stats for the configured columns
ANALYZE TABLE your_catalog.your_schema.your_table
  COMPUTE DELTA STATISTICS;&lt;/CODE&gt;&lt;/PRE&gt;
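&lt;P class="p1"&gt;To confirm the property took effect before kicking off the backfill, something like the following should work. This is a minimal sketch that reuses the placeholder table name from the statements above; passing a single property key to SHOW TBLPROPERTIES returns only that property.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;-- Check the configured stats-column list (placeholder table name)
SHOW TBLPROPERTIES your_catalog.your_schema.your_table
  ('delta.dataSkippingStatsColumns');&lt;/CODE&gt;&lt;/PRE&gt;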
&lt;P class="p1"&gt;Notes and clarifications&lt;/P&gt;
&lt;P class="p1"&gt;Your error is expected. ANALYZE … COMPUTE DELTA STATISTICS explicitly rejects FOR COLUMNS, FOR ALL COLUMNS, NOSCAN, and PARTITION clauses. Column selection for Delta stats is controlled exclusively via the table property.&lt;/P&gt;
&lt;P class="p1"&gt;If what you actually need are optimizer (CBO) stats, column-level syntax is supported there:&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;ANALYZE TABLE … COMPUTE STATISTICS FOR COLUMNS …&lt;/P&gt;
&lt;/LI&gt;
&lt;LI&gt;
&lt;P class="p1"&gt;ANALYZE TABLE … COMPUTE STATISTICS FOR ALL COLUMNS&lt;/P&gt;
&lt;/LI&gt;
&lt;/UL&gt;
&lt;P class="p1"&gt;Just note that those collect optimizer stats, not Delta data-skipping stats.&lt;/P&gt;
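&lt;P class="p1"&gt;As a rough illustration of the optimizer-stats route, assuming the same placeholder table name and using col33 and col34 purely as example columns:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;-- Collect optimizer (CBO) column statistics for selected columns only
ANALYZE TABLE your_catalog.your_schema.your_table
  COMPUTE STATISTICS FOR COLUMNS col33, col34;

-- Inspect the optimizer stats recorded for one of those columns
DESCRIBE EXTENDED your_catalog.your_schema.your_table col33;&lt;/CODE&gt;&lt;/PRE&gt;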
&lt;P class="p1"&gt;Also, changing delta.dataSkippingStatsColumns does not automatically recompute stats on existing files. That’s why running ANALYZE … COMPUTE DELTA STATISTICS afterward is required to backfill.&lt;/P&gt;
&lt;P class="p1"&gt;Why this works&lt;/P&gt;
&lt;P class="p1"&gt;Delta data-skipping stats are governed by table properties (first 32 columns by default, or an explicit list you provide) and are recomputed via COMPUTE DELTA STATISTICS. There is currently no supported FOR COLUMNS equivalent for Delta stats.&lt;/P&gt;
&lt;P class="p1"&gt;Optimizer stats and Delta stats are separate systems. Column-level ANALYZE only applies to optimizer stats, not Delta data-skipping.&lt;/P&gt;
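&lt;P class="p1"&gt;For completeness, the "first 32 columns" default is controlled by a separate table property, delta.dataSkippingNumIndexedCols. Below is a sketch of that alternative, assuming the new columns sit within the first 47 columns of the schema (47 is only an illustrative number, taken from 32 + 15) and that no explicit delta.dataSkippingStatsColumns list is set, since my understanding is that an explicit list takes precedence. As far as I know, existing files still need the same backfill afterwards.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE&gt;-- Alternative: collect stats on the first N columns instead of naming them
ALTER TABLE your_catalog.your_schema.your_table
  SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '47');

-- Backfill Delta (data-skipping) stats for existing files
ANALYZE TABLE your_catalog.your_schema.your_table
  COMPUTE DELTA STATISTICS;&lt;/CODE&gt;&lt;/PRE&gt;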
&lt;P class="p1"&gt;Cheers, Louis&lt;/P&gt;</description>
      <pubDate>Mon, 09 Feb 2026 15:02:05 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/for-columns-not-supported-in-delta-analyze/m-p/147724#M52771</guid>
      <dc:creator>Louis_Frolio</dc:creator>
      <dc:date>2026-02-09T15:02:05Z</dc:date>
    </item>
  </channel>
</rss>

