<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Z-Ordering Timestamp Column in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17946#M11862</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a large Delta table of IoT data from over 10K different sensors, with timestamp, sensor name, and value columns at 1-second precision.&lt;/P&gt;&lt;P&gt;Queries usually select a random set of 5-100 sensors at a time, typically restricted to a specific year/month/day interval.&lt;/P&gt;&lt;P&gt;The table grows every 4 hours, with Auto Loader appending pre-landing Parquet files to the Delta table.&lt;/P&gt;&lt;P&gt;I'm planning to create year, year-month, and year-month-day columns and Z-Order by these to improve query time.&lt;/P&gt;&lt;P&gt;My question is: when a SQL query filters on the timestamp column, will it take advantage of the Z-Ordering? Or do I have to filter on the calculated columns (year, year-month, year-month-day) in the WHERE clause?&lt;/P&gt;</description>
    <pubDate>Thu, 08 Dec 2022 03:55:32 GMT</pubDate>
    <dc:creator>numersoz</dc:creator>
    <dc:date>2022-12-08T03:55:32Z</dc:date>
    <item>
      <title>Z-Ordering Timestamp Column</title>
      <link>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17946#M11862</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have a large Delta table of IoT data from over 10K different sensors, with timestamp, sensor name, and value columns at 1-second precision.&lt;/P&gt;&lt;P&gt;Queries usually select a random set of 5-100 sensors at a time, typically restricted to a specific year/month/day interval.&lt;/P&gt;&lt;P&gt;The table grows every 4 hours, with Auto Loader appending pre-landing Parquet files to the Delta table.&lt;/P&gt;&lt;P&gt;I'm planning to create year, year-month, and year-month-day columns and Z-Order by these to improve query time.&lt;/P&gt;&lt;P&gt;My question is: when a SQL query filters on the timestamp column, will it take advantage of the Z-Ordering? Or do I have to filter on the calculated columns (year, year-month, year-month-day) in the WHERE clause?&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 03:55:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17946#M11862</guid>
      <dc:creator>numersoz</dc:creator>
      <dc:date>2022-12-08T03:55:32Z</dc:date>
    </item>
    <item>
      <title>Re: Z-Ordering Timestamp Column</title>
      <link>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17947#M11863</link>
      <description>&lt;P&gt;Hi @Nurettin Ersoz, you have to filter on the columns you used for Z-Ordering in the SQL query.&lt;/P&gt;&lt;P&gt;It seems you want to pass multiple columns to ZORDER BY as a comma-separated list. Note that the effectiveness of the locality drops with each additional column, so performance may not improve.&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.youtube.com/watch?v=A1aR1A8OwOU" target="test_blank"&gt;https://www.youtube.com/watch?v=A1aR1A8OwOU&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The video above might help; have a look at it.&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 09:55:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17947#M11863</guid>
      <dc:creator>Geeta1</dc:creator>
      <dc:date>2022-12-08T09:55:04Z</dc:date>
    </item>
    <item>
      <title>Re: Z-Ordering Timestamp Column</title>
      <link>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17949#M11865</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Thank you, this helps!&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 14:11:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/17949#M11865</guid>
      <dc:creator>numersoz</dc:creator>
      <dc:date>2022-12-08T14:11:13Z</dc:date>
    </item>
    <item>
      <title>Re: Z-Ordering Timestamp Column</title>
      <link>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/38721#M26731</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/78016"&gt;@numersoz&lt;/a&gt;&amp;nbsp;did you Z-order on the timestamp column or on less granular columns, like year, month, or day? The timestamp column is very granular (high cardinality), since it also includes the hour, minute, and second.&lt;/P&gt;</description>
      <pubDate>Sat, 29 Jul 2023 22:31:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/z-ordering-timestamp-column/m-p/38721#M26731</guid>
      <dc:creator>Oliver_Angelil</dc:creator>
      <dc:date>2023-07-29T22:31:31Z</dc:date>
    </item>
  </channel>
</rss>

