cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

When should I use a bloom filter index vs. a z-order index? Any best practices around it?

User16826992724
New Contributor III
 
1 REPLY 1

User16826992724
New Contributor III

Just like B-tree indices in the traditional EDW world, Z-order indexing can be used on high-cardinality columns like Primary Key columns and high-cardinality joins like facts and dimension tables joins. Z-order indexes can be created only on the first 32 columns of a table.

  1. Bloom filter index - can be used where z-order index cannot be used ( meaning columns on wide tables with more than 32 columns- for e.g. columns at the end of the tables like MM,YYYY, datetime, quarter etc.
  2. Bloom filters are also be used for less-used low or high cardinality fields that users typically use in WHERE conditions. ( You don't want to use too many columns in one z-ordex index as its efficiency may reduce).
  3. Bloom filter index can be used as a mechanism to avoid over-partitioning. In Databricks, we donโ€™t recommend over-partition - each partition should have around 1 GB data (compressed) at least, In such cases, when the table is not too big and you don't want to partition, putting a bloom filter index on date or yyyy-mm columns - can achieve the benefits of faster queries when a predicate is put on the date columns. Similarly, if partitioning is done on the Year column then a bloom filter on Month-Year can help.
  4. Even when a column has a z-order Index, we can put an additional  bloom filter index  on it for better performance.
Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.