Warehousing & Analytics
Engage in discussions on data warehousing, analytics, and BI solutions within the Databricks Community. Share insights, tips, and best practices for leveraging data for informed decision-making.

Behaviour of ANALYZE command varying when using different clusters and table types.

yshah
New Contributor

Certain tables have this configuration enabled, whereas others do not:
`delta.checkpointPolicy=v2`

This is affecting the behavior of the ANALYZE command:

If the flag is enabled: table stats are not visible after running the DESCRIBE command on a SINGLE user cluster.
If the flag is disabled: table stats are visible in the standard "table properties" as key-value pairs on a SINGLE user cluster.

Please help us understand why this change in behavior exists for different cluster types and for tables with this flag.

 

1 ACCEPTED SOLUTION


BigRoux
Databricks Employee

@yshah this is a great question. Let me explain what's happening:

The Delta Lake table property `delta.checkpointPolicy=v2` changes how and where table statistics are stored and displayed when you run ANALYZE and DESCRIBE TABLE commands.

Classic vs V2 Checkpoint Policy

With `delta.checkpointPolicy=classic`:
Table stats are saved in the transaction log and shown as key-value pairs in table properties, which you can readily see using DESCRIBE TABLE, even on single-user clusters.

With `delta.checkpointPolicy=v2` enabled:
Stats are stored in optimized checkpoint files (such as manifests or sidecars), not as key-value pairs in table properties. As a result, DESCRIBE TABLE does not display these stats for tables with v2 checkpointing.
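
If you want to compare the two cases on your own tables, one option is to run the same three statements against a table with each policy and look at the output side by side. This is only a minimal sketch: the table name `main.sales.orders` and the helper `stats_check_statements` are hypothetical, and it only builds the SQL strings so it runs anywhere.

```python
def stats_check_statements(table: str) -> list[str]:
    """Build the SQL statements used to compare stats visibility for one table.

    `table` is a hypothetical fully qualified name; substitute your own.
    """
    return [
        # Report the table's checkpoint policy property.
        f"SHOW TBLPROPERTIES {table} ('delta.checkpointPolicy')",
        # Compute table-level statistics.
        f"ANALYZE TABLE {table} COMPUTE STATISTICS",
        # Inspect the table metadata, where the stats may or may not appear.
        f"DESCRIBE TABLE EXTENDED {table}",
    ]


for statement in stats_check_statements("main.sales.orders"):
    print(statement)
```

In a Databricks notebook each string could be passed to `spark.sql(...)` and displayed. Repeating this on one classic table and one v2 table, on the same cluster, should show whether the difference follows the table property or the cluster type.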

Why This Change Matters

The reason for this change is to boost performance and reduce metadata costs, which is especially important for streaming and high-frequency workloads. However, it also means some legacy behaviors and tools that expect stats in table properties will no longer see them unless you use the classic policy.

Recommendation

If you need stats to show up in table properties for use with legacy workflows or third-party tools, stick with `delta.checkpointPolicy=classic`. If you prefer better metadata efficiency and don't require stats in table properties, v2 is recommended.
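
As a rough illustration of the recommendation, the policy is a table property, so it can be set with `ALTER TABLE ... SET TBLPROPERTIES`. The sketch below only builds that statement; the helper name and the table name are hypothetical. Whether an existing v2 table can be switched back to classic this way on your runtime is an assumption worth checking against the Databricks documentation before relying on it.

```python
ALLOWED_POLICIES = ("classic", "v2")


def set_checkpoint_policy_statement(table: str, policy: str) -> str:
    """Build an ALTER TABLE statement that sets delta.checkpointPolicy.

    `table` is a hypothetical table name. Only the two policy values
    mentioned in this thread are accepted.
    """
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"policy must be one of {ALLOWED_POLICIES}, got {policy!r}")
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('delta.checkpointPolicy' = '{policy}')"
    )


print(set_checkpoint_policy_statement("main.sales.orders", "classic"))
```

After changing the policy you would likely need to rerun ANALYZE and then DESCRIBE TABLE to see whether the stats show up where your tools expect them.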

 

Hope this helps. Louis.
