<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: When should I create a Bloom Filter Index on my Delta table? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/when-should-i-create-a-bloom-filter-index-on-my-delta-table/m-p/23483#M16208</link>
    <description>&lt;P&gt;A Bloom filter index is a space-efficient data structure that enables data skipping on chosen columns, particularly for fields containing arbitrary text. For a given file, the Bloom filter reports either that a value is definitely &lt;I&gt;not in&lt;/I&gt; the file or that it is &lt;I&gt;probably in&lt;/I&gt; the file, with a defined false positive probability (FPP).&lt;/P&gt;&lt;P&gt;The main reason to use a Bloom filter index is that you often query on a specific set of columns. A typical use case is a large table from which you query only a small subset of the data; the index helps with these “needle in a haystack” queries.&lt;/P&gt;</description>
    <pubDate>Fri, 18 Jun 2021 00:00:40 GMT</pubDate>
    <dc:creator>Ryan_Chynoweth</dc:creator>
    <dc:date>2021-06-18T00:00:40Z</dc:date>
    <item>
      <title>When should I create a Bloom Filter Index on my Delta table?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-should-i-create-a-bloom-filter-index-on-my-delta-table/m-p/23482#M16207</link>
      <description />
      <pubDate>Thu, 17 Jun 2021 03:57:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-should-i-create-a-bloom-filter-index-on-my-delta-table/m-p/23482#M16207</guid>
      <dc:creator>User16826992666</dc:creator>
      <dc:date>2021-06-17T03:57:38Z</dc:date>
    </item>
    <item>
      <title>Re: When should I create a Bloom Filter Index on my Delta table?</title>
      <link>https://community.databricks.com/t5/data-engineering/when-should-i-create-a-bloom-filter-index-on-my-delta-table/m-p/23483#M16208</link>
      <description>&lt;P&gt;A Bloom filter index is a space-efficient data structure that enables data skipping on chosen columns, particularly for fields containing arbitrary text. For a given file, the Bloom filter reports either that a value is definitely &lt;I&gt;not in&lt;/I&gt; the file or that it is &lt;I&gt;probably in&lt;/I&gt; the file, with a defined false positive probability (FPP).&lt;/P&gt;&lt;P&gt;The main reason to use a Bloom filter index is that you often query on a specific set of columns. A typical use case is a large table from which you query only a small subset of the data; the index helps with these “needle in a haystack” queries.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Jun 2021 00:00:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/when-should-i-create-a-bloom-filter-index-on-my-delta-table/m-p/23483#M16208</guid>
      <dc:creator>Ryan_Chynoweth</dc:creator>
      <dc:date>2021-06-18T00:00:40Z</dc:date>
    </item>
  </channel>
</rss>

