Accepted Solutions
06-23-2021 09:22 PM
By default, a Delta table collects statistics on the first 32 columns. This default can be changed with the following setting:
SET spark.databricks.delta.properties.defaults.dataSkippingNumIndexedCols = 3
However, there is a time trade-off to collecting stats on a large number of columns. You typically want to collect stats only on columns that are used in filters, WHERE clauses, and joins, and on which you tend to perform aggregations.
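As a minimal sketch, the same limit can also be applied per table (rather than as a session-wide default) through the delta.dataSkippingNumIndexedCols table property; the table name below is illustrative:

-- Apply the stats-collection limit to one existing table (table name is hypothetical)
ALTER TABLE my_schema.events
  SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '5');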

