<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic What is Z-ordering in Delta and what are some best practices on using it? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26639#M18665</link>
    <description>&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 28 May 2021 19:23:24 GMT</pubDate>
    <dc:creator>aladda</dc:creator>
    <dc:date>2021-05-28T19:23:24Z</dc:date>
    <item>
      <title>What is Z-ordering in Delta and what are some best practices on using it?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26639#M18665</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 28 May 2021 19:23:24 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26639#M18665</guid>
      <dc:creator>aladda</dc:creator>
      <dc:date>2021-05-28T19:23:24Z</dc:date>
    </item>
    <item>
      <title>Re: What is Z-ordering in Delta and what are some best practices on using it?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26640#M18666</link>
      <description>&lt;P&gt;Nicely written&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jun 2021 07:50:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26640#M18666</guid>
      <dc:creator>User16826994223</dc:creator>
      <dc:date>2021-06-08T07:50:41Z</dc:date>
    </item>
    <item>
      <title>Re: What is Z-ordering in Delta and what are some best practices on using it?</title>
      <link>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26641#M18667</link>
      <description>&lt;P&gt;Z-Ordering is a&amp;nbsp;&lt;A href="https://en.wikipedia.org/wiki/Z-order_curve" alt="https://en.wikipedia.org/wiki/Z-order_curve" target="_blank"&gt;technique&lt;/A&gt;&amp;nbsp;to colocate related information in the same set of files. Delta Lake on Databricks automatically uses this co-locality in its data-skipping algorithms, which can dramatically reduce the amount of data that needs to be read. Syntax for Z-ordering can be found &lt;A href="https://docs.databricks.com/delta/optimizations/file-mgmt.html#data-skipping" alt="https://docs.databricks.com/delta/optimizations/file-mgmt.html#data-skipping" target="_blank"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If you expect a column to be commonly used in query predicates, and that column has high cardinality (that is, a large number of distinct values), which might make it ineffective for partitioning the table, then use&amp;nbsp;ZORDER&amp;nbsp;BY instead. For example, for a table of companies and dates that collects data over several years, you might partition by company and Z-order by date.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;You can specify multiple columns for&amp;nbsp;ZORDER&amp;nbsp;BY as a comma-separated list. However, the effectiveness of the locality drops with each additional column.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Note that statistics must be collected on the columns you Z-order by; otherwise data skipping won't take effect. It's therefore important to either reorder the table so that the Z-order column(s) are among the first 32 columns, or change the dataSkippingNumIndexedCols property.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;And if you learn best through visuals, &lt;A href="https://www.youtube.com/watch?v=A1aR1A8OwOU" alt="https://www.youtube.com/watch?v=A1aR1A8OwOU" target="_blank"&gt;this is a great explainer video&lt;/A&gt; on Z-ordering on Delta tables.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 20 Jun 2021 03:25:11 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/what-is-z-ordering-in-delta-and-what-are-some-best-practices-on/m-p/26641#M18667</guid>
      <dc:creator>aladda</dc:creator>
      <dc:date>2021-06-20T03:25:11Z</dc:date>
    </item>
  </channel>
</rss>

