Query related to Z ordering
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-03-2025 02:52 AM
I have join on two large tables.
If i apply Z ordering on 3 columns for both the tables ,
I am joining two tables on the basis on the same 3 columns used for Z ordering ,
Will i get any benefit of Z ordering on performance when i use Joins here .
So as per my current knowledge Z ordering is specifically useful for faster query data retrieval i.e. select *from table1 where zordercolumn1 =X
Does Z ordering also helps Joins ? if not how to improve join performance here
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-03-2025 02:57 AM
I have asked this query because w.r.t Predicate query push down to storage .
So here as part of joins . Data needs to be loaded into memory first and then join is performed . So How does Z order help here if it can't skip the data being fetched from storage .
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
02-04-2025 12:30 AM
hi @kasiviss42 indeed z-order helps in file level data skipping . Now answer is 2 fold.
1) if you're joining 2 dataframes then you can make use of data skipping a.k.a z-ordering by performing filters first on those dataframes which will be then used for joins. (predicate push down). your dataframes are already lighter and in-memory is optimized.
2) you can also specify merge predicate during your join if you're using delta native merge . This merge predicate consists of partition pruning predicate + data skipping predicate + join predicate. So the combination of predicate will be applied to filter down to relevant partitions , relevant files if they are z-ordered.
example:

