Louis_Frolio
Databricks Employee

Delta statistics are used mainly by the Delta protocol itself (for example, for file-level data skipping). The statistics from `ANALYZE TABLE ...`, on the other hand, are extremely beneficial to Spark and the Catalyst optimizer. You need both. Delta statistics are collected automatically. `ANALYZE TABLE` is run manually or, if you are using Unity Catalog with managed tables, it is automatic there too. Details below.

 

For Unity Catalog managed tables in Databricks, computing statistics is handled automatically. This automation is part of Databricks’ predictive optimization feature, which is enabled by default for Unity Catalog managed tables as of July 2025 for all accounts.

Key points:

- Automatic statistics collection: Unity Catalog managed tables automatically gather essential statistics, such as minimum and maximum values for columns. This enables efficient data skipping and join strategies, improving query performance and reducing computational overhead.
- Predictive optimization: Predictive optimization automatically runs `ANALYZE` (i.e., compute statistics) on Unity Catalog managed tables. This means you do not need to manually run `ANALYZE TABLE ... COMPUTE STATISTICS` for these tables unless you want to force an update or target specific columns.
- When statistics are collected: With predictive optimization enabled, statistics are collected automatically when data is written to a managed table. The system also identifies when maintenance operations, like statistics collection, are needed and runs them as necessary.
- Manual intervention: While automatic collection is the default, you can still use the `ANALYZE TABLE` command to manually compute or refresh statistics if required for specific scenarios.
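
For the manual case, here is a minimal sketch of how the `ANALYZE TABLE ... COMPUTE STATISTICS` variants might be assembled before handing them to Spark. The helper `analyze_statement` and the table and column names (`main.sales.orders`, `customer_id`, `order_date`) are made up for illustration, and the commented-out call assumes an active SparkSession named `spark`, as in a Databricks notebook.

```python
from typing import Optional, Sequence


def analyze_statement(
    table: str,
    columns: Optional[Sequence[str]] = None,
    all_columns: bool = False,
) -> str:
    """Build an ANALYZE TABLE statement (hypothetical helper, not a Databricks API)."""
    base = f"ANALYZE TABLE {table} COMPUTE STATISTICS"
    if all_columns:
        # Column-level statistics for every column in the table.
        return f"{base} FOR ALL COLUMNS"
    if columns:
        # Column-level statistics for the listed columns only.
        return f"{base} FOR COLUMNS {', '.join(columns)}"
    # No column clause: table-level statistics only.
    return base


# Hypothetical table and column names, for illustration only.
stmt = analyze_statement("main.sales.orders", ["customer_id", "order_date"])
print(stmt)

# In a Databricks notebook with a SparkSession named `spark`, you could run:
# spark.sql(stmt)
```

Targeting only the columns you join or filter on keeps the refresh cheaper than `FOR ALL COLUMNS`, which is usually what you want when forcing an update by hand.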

**Summary table:**

| Table Type | Statistics Collection | Manual ANALYZE Needed? |
|-----------------------------------|----------------------|------------------------------|
| Unity Catalog Managed Table | Automatic | Optional (for manual refresh)|
| Unity Catalog External Table | Partial/Manual | Often required |
| Legacy Hive Metastore Managed | Manual | Required |

Conclusion: For Unity Catalog managed tables, Databricks automatically computes statistics as part of its built-in optimization features, so manual intervention is generally unnecessary unless you have a specific need.

 

Hope this helps. Lou.