Dynamic Bloom Filters for Inner Joins

tomvogel01
New Contributor II

I have a question about combining Bloom filters with Liquid Clustering to further reduce the data read during a join/merge, on top of dynamic file pruning. In testing, the two worked extremely well together for point queries. However, having Bloom filters on a table removed dynamic file pruning entirely and led to the entire table being read when doing a join/merge, both with and without Photon.

Do Bloom filters work alongside dynamic file pruning? If so, any thoughts as to what might be going wrong?

If not, is there a plan to support such functionality? It would be amazing to have, as it reduced the amount of data read by a factor of 20.
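
To make the idea concrete, here is a minimal, purely conceptual sketch in plain Python of how per-file Bloom filters could be used to skip files for a set of join keys. It is not how Databricks or Delta Lake implements Bloom filter indexes or dynamic file pruning; the `BloomFilter` class, the `files_to_read` helper, and the `part-0`/`part-1`/`part-2` file names are all hypothetical and exist only for illustration.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a SHA-256 digest.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def files_to_read(file_filters, join_keys):
    """Return the files whose filter matches at least one join key."""
    return [
        name
        for name, bloom in file_filters.items()
        if any(bloom.might_contain(k) for k in join_keys)
    ]


# Hypothetical layout: three data files, each holding a disjoint range of ids.
file_contents = {
    "part-0": range(0, 100),
    "part-1": range(100, 200),
    "part-2": range(200, 300),
}

# Build one filter per file, as an index would at write time.
file_filters = {}
for name, ids in file_contents.items():
    bloom = BloomFilter()
    for i in ids:
        bloom.add(i)
    file_filters[name] = bloom

# Keys coming from the other side of the join; only part-1 truly holds them.
selected = files_to_read(file_filters, join_keys=[150, 151])
print(selected)
```

Because a Bloom filter never gives a false negative, `part-1` is always selected here. The other files are usually skipped, but may occasionally be included as false positives, which costs extra reads without affecting correctness.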

2 REPLIES

Could you point me to the specific online resources that discuss this? My research has yielded very little guidance, which is why I am reaching out here.

NandiniN
Databricks Employee

We do not recommend Bloom filter indexes on Delta tables, as they have to be maintained manually.

If you prefer Photon, please try predictive I/O with Liquid Clustering.
