<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: What exactly is Z Ordering and Bloom Filter? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14231#M8758</link>
    <description>&lt;P&gt;A Bloom filter is like looking for a needle in a haystack (with a false positive probability, FPP), so it's more useful for string columns.&lt;/P&gt;&lt;P&gt;Z-Order works best with a couple of columns that are used in filters/joins.&lt;/P&gt;&lt;P&gt;They can be used independently of each other or together.&lt;/P&gt;&lt;P&gt;See an example here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.mssqltips.com/sqlservertip/6968/bloom-filter-indexes-using-databricks-delta/" target="test_blank"&gt;https://www.mssqltips.com/sqlservertip/6968/bloom-filter-indexes-using-databricks-delta/&lt;/A&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 29 Dec 2022 06:28:55 GMT</pubDate>
    <dc:creator>daniel_sahal</dc:creator>
    <dc:date>2022-12-29T06:28:55Z</dc:date>
    <item>
      <title>What exactly is Z Ordering and Bloom Filter?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14230#M8757</link>
      <description>&lt;P&gt;I have gone through the documentation but still cannot understand it.&lt;/P&gt;&lt;P&gt;How is Bloom filter indexing a column different from Z-ordering a column?&lt;/P&gt;&lt;P&gt;Can somebody explain what exactly happens when these two techniques are applied?&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2022 00:05:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14230#M8757</guid>
      <dc:creator>hello_world</dc:creator>
      <dc:date>2022-12-29T00:05:17Z</dc:date>
    </item>
    <item>
      <title>Re: What exactly is Z Ordering and Bloom Filter?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14231#M8758</link>
      <description>&lt;P&gt;A Bloom filter is like looking for a needle in a haystack (with a false positive probability, FPP), so it's more useful for string columns.&lt;/P&gt;&lt;P&gt;Z-Order works best with a couple of columns that are used in filters/joins.&lt;/P&gt;&lt;P&gt;They can be used independently of each other or together.&lt;/P&gt;&lt;P&gt;See an example here:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.mssqltips.com/sqlservertip/6968/bloom-filter-indexes-using-databricks-delta/" target="test_blank"&gt;https://www.mssqltips.com/sqlservertip/6968/bloom-filter-indexes-using-databricks-delta/&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2022 06:28:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14231#M8758</guid>
      <dc:creator>daniel_sahal</dc:creator>
      <dc:date>2022-12-29T06:28:55Z</dc:date>
    </item>
    <item>
      <title>Re: What exactly is Z Ordering and Bloom Filter?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14232#M8759</link>
      <description>&lt;P&gt;Hey @Daniel Sahal&lt;/P&gt;&lt;P&gt;1. A &lt;A href="https://en.wikipedia.org/wiki/Bloom_filter" alt="https://en.wikipedia.org/wiki/Bloom_filter" target="_blank"&gt;Bloom filter&lt;/A&gt; index is a space-efficient data structure that enables data skipping on chosen columns, particularly for fields containing arbitrary text.&lt;/P&gt;&lt;P&gt;Refer to this syntax to create a Bloom filter index:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;CREATE BLOOMFILTER INDEX
ON [TABLE] table_name
[FOR COLUMNS( { columnName1 [ options ] } [, ...] ) ]
[ options ]

options
  OPTIONS ( { key1 [ = ] val1 } [, ...] )&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;2. Z-ordering is a &lt;A href="https://en.wikipedia.org/wiki/Z-order_curve" alt="https://en.wikipedia.org/wiki/Z-order_curve" target="_blank"&gt;technique&lt;/A&gt; to colocate related information in the same set of files. This co-locality is automatically used by the data-skipping algorithms of Delta Lake on Databricks, which dramatically reduces the amount of data that needs to be read. To Z-order data, you specify the columns to order on in the ZORDER BY clause:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;OPTIMIZE events
WHERE date &amp;gt;= current_timestamp() - INTERVAL 1 day
ZORDER BY (eventType)&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Dec 2022 08:28:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14232#M8759</guid>
      <dc:creator>Rishabh-Pandey</dc:creator>
      <dc:date>2022-12-29T08:28:30Z</dc:date>
    </item>
    <item>
      <title>Re: What exactly is Z Ordering and Bloom Filter?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14233#M8760</link>
      <description>&lt;P&gt;In the example, the Bloom filter is also used for filters.&lt;/P&gt;&lt;P&gt;How do we decide which columns to Bloom-filter index and which to Z-order? Is it based on whether the data type is string or non-string?&lt;/P&gt;</description>
      <pubDate>Thu, 29 Dec 2022 10:12:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-exactly-is-z-ordering-and-bloom-filter/m-p/14233#M8760</guid>
      <dc:creator>hello_world</dc:creator>
      <dc:date>2022-12-29T10:12:06Z</dc:date>
    </item>
  </channel>
</rss>

