Data Engineering

What is the upper bound for dataSkippingNumIndexedCols, to keep stats in the Delta log file?

chhavibansal
New Contributor III

Is there an upper bound on the value I can assign to delta.dataSkippingNumIndexedCols for computing statistics? Is there a tradeoff benchmark available for increasing this number beyond 32?

1 REPLY

Anonymous
Not applicable

@Chhavi Bansal:

The delta.dataSkippingNumIndexedCols table property controls how many columns Delta Lake collects statistics on for data skipping. The default is 32, and statistics are collected on the first N columns in the table schema, so column order matters. There is no hard upper bound on the value, and setting it to -1 collects statistics on all columns. However, a very large value can hurt performance: statistics are computed on every write and stored per file in the Delta log, so more indexed columns means more write overhead and larger log entries, particularly for columns with long string values. The optimal value depends on the characteristics of your data and the workload you are running. A reasonable guideline is to set delta.dataSkippingNumIndexedCols equal to or slightly larger than the number of columns you expect to use commonly in filter predicates, and to make sure those columns fall within the first N positions of the schema. You can also adjust the value based on the size of your data and the resources available to your cluster.
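
To make this concrete, here is a minimal sketch of how the property might be set on an existing table. It only builds the ALTER TABLE statement as a string; the helper name set_num_indexed_cols_sql and the table name events are made up for illustration, and on a cluster you would pass the resulting string to spark.sql(...).

```python
def set_num_indexed_cols_sql(table_name: str, num_cols: int) -> str:
    """Build an ALTER TABLE statement that sets delta.dataSkippingNumIndexedCols.

    num_cols = -1 asks Delta Lake to collect statistics on all columns.
    """
    if num_cols < -1:
        raise ValueError("num_cols must be -1 (all columns) or a non-negative integer")
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '{num_cols}')"
    )


# "events" is a hypothetical table name used only for this example.
statement = set_num_indexed_cols_sql("events", 40)
print(statement)
```

As far as I know, changing the property only affects files written afterwards; statistics for existing files are not recomputed automatically, so test with freshly written data.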

As for the tradeoff benchmark, I am not aware of any specific benchmark related to this configuration property. However, you can monitor the performance and memory usage of your Delta Lake workload with different values of this configuration property to determine the optimal value for your specific use case.
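
One way to run such a comparison is a small timing harness like the sketch below. It is not an official benchmark: apply_setting and run_workload are hypothetical callables that you would supply yourself, for example one that sets the table property and rewrites the data, and one that runs your typical write and filtered-read queries.

```python
import time
from typing import Callable, Dict, Iterable


def time_workload_per_setting(
    candidates: Iterable[int],
    apply_setting: Callable[[int], None],
    run_workload: Callable[[], None],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the best wall-clock time (seconds) of run_workload for each candidate value."""
    results: Dict[int, float] = {}
    for value in candidates:
        # Caller-supplied: e.g. set the table property and rewrite the test data.
        apply_setting(value)
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_workload()
            timings.append(time.perf_counter() - start)
        # Keep the fastest run to reduce noise from caching and cluster warm-up.
        results[value] = min(timings)
    return results


# Stand-in callables so the sketch runs anywhere; replace them with real Spark code.
applied = []
timings = time_workload_per_setting([32, 64], applied.append, lambda: None, repeats=2)
print(timings)
```

Because statistics are collected at write time, the workload should probably include the write path as well as the filtered reads; otherwise the cost side of the tradeoff will not show up in the numbers.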
