Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How can I use data skipping with Delta Lake

Srikanth_Gupta_
Valued Contributor

How does data skipping work with Delta Lake? Can I run ANALYZE TABLE COMPUTE STATISTICS on a Delta table, or is Z-Ordering what I should use to get the benefit of data skipping?

2 REPLIES

sajith_appukutt
Honored Contributor II

You do not need to configure data skipping for Delta Lake; it is used automatically whenever applicable.

The effectiveness of data skipping depends on how the data is laid out across files, and you can apply Z-Ordering for best results.
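To make the layout point concrete, here is a small pure-Python sketch. It is not Delta Lake code: `FileStats`, `build_files`, and `files_to_scan` are hypothetical names, and the per-file min/max values only stand in for the statistics Delta Lake keeps for each data file. The assumption is that a file can be skipped when its min/max range cannot contain the value range a query filters on.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileStats:
    """Min/max of one column for one data file (hypothetical stand-in for per-file statistics)."""
    path: str
    min_value: int
    max_value: int


def build_files(values: List[int], rows_per_file: int) -> List[FileStats]:
    """Split the values into fixed-size 'files' and record each file's min and max."""
    files = []
    for start in range(0, len(values), rows_per_file):
        chunk = values[start:start + rows_per_file]
        files.append(FileStats(f"part-{start // rows_per_file:03d}", min(chunk), max(chunk)))
    return files


def files_to_scan(files: List[FileStats], lower: int, upper: int) -> List[FileStats]:
    """Keep only files whose [min, max] range overlaps the filter range [lower, upper]."""
    return [f for f in files if f.max_value >= lower and f.min_value <= upper]


# The same 100 values written in two layouts: scattered across files vs. sorted.
scattered = [(i * 37) % 100 for i in range(100)]
clustered = sorted(scattered)

scattered_files = build_files(scattered, rows_per_file=10)
clustered_files = build_files(clustered, rows_per_file=10)

# Filter equivalent to: WHERE col BETWEEN 40 AND 49
scattered_hits = files_to_scan(scattered_files, 40, 49)
clustered_hits = files_to_scan(clustered_files, 40, 49)

print(f"scattered layout: scan {len(scattered_hits)} of {len(scattered_files)} files")
print(f"clustered layout: scan {len(clustered_hits)} of {len(clustered_files)} files")
```

With the sorted layout only one file overlaps the filter range, while the scattered layout leaves wide min/max ranges in every file and very little can be skipped. Z-Ordering aims for the clustered kind of layout on the columns you filter by.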

Anonymous
New Contributor III

You can use Z-Ordering to make data skipping more effective. Data skipping information is collected automatically when you write to a Delta table.
Delta Lake uses this information at query time to make queries faster.
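As a rough sketch of what applying Z-Ordering looks like, the helper below builds an OPTIMIZE ... ZORDER BY statement as a string. The helper name `zorder_sql`, the table name `events`, and the column `event_date` are all hypothetical; on a cluster you would pass the resulting string to `spark.sql(...)` on an existing SparkSession.

```python
from typing import Sequence


def zorder_sql(table: str, columns: Sequence[str]) -> str:
    """Build an OPTIMIZE ... ZORDER BY statement for the given table and columns."""
    if not columns:
        raise ValueError("ZORDER BY needs at least one column")
    return f"OPTIMIZE {table} ZORDER BY ({', '.join(columns)})"


# Hypothetical table and column; pick the columns you filter on most often.
statement = zorder_sql("events", ["event_date"])
print(statement)

# On Databricks (assumes an existing SparkSession named `spark`):
# spark.sql(statement)
```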

You don't need to configure anything for data skipping, as the feature is activated whenever applicable. However, its effectiveness depends on the layout of the data. By default, Delta Lake collects statistics on the first 32 columns, which can be changed using the table property delta.dataSkippingNumIndexedCols. Collecting statistics on more columns adds overhead when you write files.

Collecting statistics on long strings is an expensive operation, so it is best avoided. You can either set the table property delta.dataSkippingNumIndexedCols so that such columns fall outside the indexed range, or move the columns containing long strings to a position greater than delta.dataSkippingNumIndexedCols.
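For the first option, a minimal sketch of setting the property: the helper below only builds the ALTER TABLE statement as a string. `set_indexed_cols_sql`, the table name `events`, and the value 5 are hypothetical choices for illustration; the property name is the one mentioned above. The value should be chosen so the long string columns sit after the indexed columns in the schema.

```python
def set_indexed_cols_sql(table: str, num_cols: int) -> str:
    """Build an ALTER TABLE statement that sets delta.dataSkippingNumIndexedCols."""
    if not isinstance(num_cols, int) or isinstance(num_cols, bool):
        raise TypeError("num_cols must be an integer")
    return (
        f"ALTER TABLE {table} "
        f"SET TBLPROPERTIES ('delta.dataSkippingNumIndexedCols' = '{num_cols}')"
    )


# Hypothetical example: collect statistics only on the first 5 columns of `events`.
statement = set_indexed_cols_sql("events", 5)
print(statement)

# On Databricks (assumes an existing SparkSession named `spark`):
# spark.sql(statement)
```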
