What's the difference between Z-Ordering and Partitioning?

User16790091296 — Fri, 28 May 2021 19:22:29 GMT

Re: What's the difference between Z-Ordering and Partitioning?

sajith_appukutt — Thu, 24 Jun 2021 22:02:47 GMT

Partitioning is a way of distributing data by key so that each query scans only the partitions it needs. This improves performance and can help avoid conflicts between concurrent operations.

General rules of thumb for choosing the right partition columns:

The cardinality of the column should not be very high.
The amount of data in each partition should meet a minimum threshold (the Databricks documentation suggests at least 1 GB).
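
To make the two rules above concrete, here is a minimal pure-Python sketch of partition pruning. It is a toy model, not how Delta stores data: the rows, the column names and the helpers `partition_by` and `scan` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (order_id, event_date, amount)
rows = [
    (101, "2021-05-01", 10.0),
    (102, "2021-05-01", 5.0),
    (103, "2021-05-02", 7.5),
    (104, "2021-05-03", 2.0),
]

def partition_by(rows, key_index):
    """Group rows into one bucket per distinct key value (think: one directory per partition)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key_index]].append(row)
    return dict(partitions)

def scan(partitions, wanted_key):
    """Read only the partition that matches the filter and ignore all the others."""
    return partitions.get(wanted_key, [])

# Low-cardinality key: a few partitions, each holding several rows.
by_date = partition_by(rows, 1)
# High-cardinality key: one tiny partition per row, which is what the first rule warns about.
by_order_id = partition_by(rows, 0)

print(len(by_date), len(by_order_id))
print(scan(by_date, "2021-05-01"))
```

A filter on `event_date` touches a single bucket and never looks at the rest. Partitioning on `order_id` instead produces as many partitions as rows, each far too small to be worth a separate directory.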

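Delta also supports a feature called data skipping to speed up queries. It keeps statistics such as the minimum and maximum value of each column for every data file, and skips files whose range cannot match the query filter.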
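
A rough sketch of the idea behind data skipping, assuming per-file min/max statistics on a single integer column. The file names, values and helper functions are hypothetical, and this is not Delta's actual implementation.

```python
# Hypothetical data files, each holding some values of one column.
files = {
    "part-0": [1, 3, 5, 7],
    "part-1": [20, 22, 25],
    "part-2": [6, 40, 41],
}

def collect_stats(files):
    """Record the (min, max) of the column for every file."""
    return {name: (min(values), max(values)) for name, values in files.items()}

def files_to_read(stats, value):
    """Keep only the files whose [min, max] range could contain the value."""
    return [name for name, (lo, hi) in stats.items() if lo <= value <= hi]

stats = collect_stats(files)
print(files_to_read(stats, 3))   # ['part-0']
print(files_to_read(stats, 22))  # ['part-1', 'part-2']
```

Note that `part-2` is read for the value 22 even though it does not contain it, because its range (6 to 41) is wide. Skipping works best when each file covers a narrow range of values, which depends on how the data is laid out across files.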

Z-ordering is a multi-dimensional clustering technique that colocates related information in the same set of files, so that the Databricks data-skipping algorithms can dramatically reduce the amount of data that needs to be read. In terms of improving query read performance, it works somewhat like a secondary index.
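
To illustrate why clustering on several columns at once helps skipping, here is a small sketch using the textbook Z-order (Morton) curve, which interleaves the bits of two column values. It is an assumption-laden toy: the 8x8 grid, the file size of 4 rows and all function names are made up, and it does not claim to reproduce what Databricks does internally.

```python
def interleave_bits(x, y, bits=3):
    """Morton code: bits of x go to even positions, bits of y to odd positions."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z

def chunk(points, size):
    """Split an ordered list of rows into fixed-size 'files'."""
    return [points[i:i + size] for i in range(0, len(points), size)]

def files_matching(files, column, value):
    """Count files whose min/max range for the column could contain the value."""
    count = 0
    for rows in files:
        values = [row[column] for row in rows]
        if min(values) <= value <= max(values):
            count += 1
    return count

# Every (x, y) combination on an 8x8 grid, written as 16 files of 4 rows each.
points = [(x, y) for x in range(8) for y in range(8)]
linear_files = chunk(sorted(points), 4)  # ordinary sort by x, then y
zorder_files = chunk(sorted(points, key=lambda p: interleave_bits(p[0], p[1])), 4)

# Files that must be read for the filter y == 5, then for the filter x == 5.
print(files_matching(linear_files, 1, 5), files_matching(zorder_files, 1, 5))  # 8 4
print(files_matching(linear_files, 0, 5), files_matching(zorder_files, 0, 5))  # 2 4
```

With the ordinary sort, a filter on x skips well but a filter on y has to read half of the files. With the Z-order layout, each file covers a small range in both columns, so filters on either column skip most files. On a Delta table the layout is applied with a command of the form `OPTIMIZE <table> ZORDER BY (col1, col2)`.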
