Topics with Label: Zorder

Forum Posts

by qwerty1 • Contributor

05-26-2023 9:28:33 PM

367 Views
0 replies
0 kudos

Why am I not able to view all table properties?

We have a live streaming table created using the command CREATE OR REFRESH STREAMING LIVE TABLE foo TBLPROPERTIES ( "pipelines.autoOptimize.zOrderCols" = "c1,, c2, c3, c4", "delta.randomizeFilePrefixes" = "true" ); But when I run the show table propert...

Data Engineering

by hello_world • New Contributor III

12-28-2022 4:05:17 PM

2128 Views
3 replies
6 kudos

Resolved! What exactly is Z Ordering and Bloom Filter?

I have gone through the documentation but still cannot understand it. How is Bloom filter indexing a column different from Z-ordering a column? Can somebody explain what exactly happens when these two techniques are applied?

Data Engineering

Latest Reply

Rishabh264
Honored Contributor II

12-29-2022 12:28:30 AM

6 kudos

Hey @Daniel Sahal, 1 - A Bloom filter index is a space-efficient data structure that enables data skipping on chosen columns, particularly for fields containing arbitrary text. Refer to this code snippet to create a Bloom filter index: CREATE BLOOMFILTER INDEX ON [TAB...

2 More Replies
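
To make the Bloom filter idea from the reply above a bit more concrete, here is a minimal pure-Python sketch. It is only an illustration of the general data structure, not how Databricks builds its index: the `BloomFilter` class, the file names, and the sample values are all made up. The point it tries to show is that a filter can answer "definitely not in this file" cheaply, so a lookup can skip that file, while a "maybe" answer still requires reading it.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus k salted hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, value):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


# Hypothetical data files, each holding a few arbitrary text values.
files = {
    "file_a": ["a3f9", "77bc", "e210"],
    "file_b": ["0d4e", "c8a1", "5f5f"],
}

# Build one filter per file (an assumption made for this sketch).
filters = {}
for name, values in files.items():
    bf = BloomFilter()
    for v in values:
        bf.add(v)
    filters[name] = bf


def files_to_scan(value):
    """Return the files that cannot be ruled out for an equality lookup."""
    return [name for name, bf in filters.items() if bf.might_contain(value)]
```

Calling `files_to_scan("77bc")` always includes `file_a`, because a Bloom filter never reports a stored value as absent. It may occasionally include other files too (a false positive), which costs an extra read but never a wrong result. Z-ordering, by contrast, changes how rows are laid out across files rather than adding a side structure.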

by numersoz • New Contributor III

11-23-2022 7:05:12 PM

1915 Views
5 replies
10 kudos

Is ZORDER required after table overwrite?

Hi, After appending new values to a Delta table, I need to delete duplicate rows. After deleting duplicate rows using PySpark, I overwrite the table (keeping the schema). My question is, do I have to do ZORDER again? Another question, is there another wa...

Data Engineering

Latest Reply

DeepakMakwana74
New Contributor III

11-27-2022 5:30:50 AM

10 kudos

Hi @Nurettin Ersoz, try using an incremental load so that duplicates are avoided; you can run a full load once if your data has updates.

4 More Replies

by NOOR_BASHASHAIK • Contributor

10-22-2022 9:59:36 AM

2035 Views
5 replies
4 kudos

Azure Databricks VM type for OPTIMIZE with ZORDER on a single column

Dears, I was trying to check which Azure Databricks VM type is best suited for executing OPTIMIZE with ZORDER on a single timestamp-valued (but string data type) column for 5000+ tables in the Delta Lake. I chose Standard_F16s_v2 with 6 workers & 1...

Data Engineering

Latest Reply

Kaniz
Community Manager

10-25-2022 4:13:33 AM

4 kudos

Hi @NOOR BASHA SHAIK, Please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.

4 More Replies

by leos1 • New Contributor II

10-13-2022 3:48:17 AM

777 Views
2 replies
0 kudos

Resolved! Question regarding ZORDER option of OPTIMIZE

Is the order of the columns in ZORDER important? For example, does ZORDER BY (product, site) and ZORDER BY (site, product) produce the same results?

Data Engineering

Latest Reply

leos1
New Contributor II

10-13-2022 8:06:38 AM

0 kudos

thanks for the quick reply

1 More Replies
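
One way to build intuition for the column-order question is a generic Z-order (Morton code) sketch in Python. This is an assumption-laden toy, not Delta Lake's actual implementation: `interleave_bits`, `chunk`, the 4 x 4 grid of `(product, site)` values, and the four-rows-per-file size are all invented for illustration. It interleaves the bits of the two columns, sorts rows by the result, and cuts the sorted rows into "files".

```python
def interleave_bits(values, bits):
    """Interleave the bits of several non-negative ints into one Z-value.

    At each bit level, the first value in `values` contributes the more
    significant bit, so swapping the column order changes the Z-value.
    """
    z = 0
    for bit in range(bits - 1, -1, -1):  # walk from the high bit to the low bit
        for v in values:
            z = (z << 1) | ((v >> bit) & 1)
    return z


def chunk(rows, size):
    """Cut a sorted list of rows into consecutive 'files' of `size` rows."""
    return [rows[i:i + size] for i in range(0, len(rows), size)]


# Hypothetical table: every (product, site) pair with values 0..3.
rows = [(product, site) for product in range(4) for site in range(4)]

# Emulate ZORDER BY (product, site) and ZORDER BY (site, product).
by_product_site = sorted(rows, key=lambda r: interleave_bits((r[0], r[1]), bits=2))
by_site_product = sorted(rows, key=lambda r: interleave_bits((r[1], r[0]), bits=2))

files_ps = chunk(by_product_site, 4)
files_sp = chunk(by_site_product, 4)
```

In this toy, the two row orders differ, but both cut the grid into the same 2 x 2 blocks, so each "file" covers a narrow range of both columns either way. Real tables with skewed data or columns of very different cardinality may not behave this symmetrically, so treat this only as intuition for why both columns influence the layout regardless of order.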

by Anonymous • Not applicable

06-18-2021 2:18:04 PM

1061 Views
1 reply
0 kudos

Resolved! What fields should I Zorder by? Does the order of Zorder matter?

Data Engineering

Latest Reply

User16826994223
Honored Contributor III

06-21-2021 5:51:45 AM

0 kudos

If you expect a column to be commonly used in query predicates and if that column has high cardinality (that is, a large number of distinct values), then use ZORDER BY. You can specify multiple columns for ZORDER BY as a comma-separated list. However,...
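
As a rough sketch of why a high-cardinality predicate column benefits from clustering, the Python below models per-file min/max statistics. Everything here is hypothetical: the helper names, the 1000-value `ids` column, and the 100-rows-per-file size are assumptions, and a plain single-column sort stands in for Z-order clustering. It only illustrates the general data-skipping idea: a file can be skipped when the predicate value falls outside its [min, max] range.

```python
import random


def split_into_files(values, rows_per_file):
    """Chunk values into 'files' and record each file's (min, max) statistics."""
    stats = []
    for start in range(0, len(values), rows_per_file):
        part = values[start:start + rows_per_file]
        stats.append((min(part), max(part)))
    return stats


def files_to_read(file_stats, target):
    """Count files whose [min, max] range could contain the target value."""
    return sum(1 for lo, hi in file_stats if lo <= target <= hi)


random.seed(0)
ids = list(range(1000))  # high-cardinality column: 1000 distinct values
shuffled = ids[:]
random.shuffle(shuffled)  # rows arrive in no particular order

# Same rows, two layouts: as-arrived versus clustered on the column.
unclustered = split_into_files(shuffled, rows_per_file=100)
clustered = split_into_files(sorted(shuffled), rows_per_file=100)
```

With the clustered layout, `files_to_read(clustered, 437)` is 1, because only one file's range covers that value. With the shuffled layout, most files tend to have wide, overlapping ranges, so `files_to_read(unclustered, 437)` is usually much larger and far fewer files can be skipped.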

by User16826992666 • Valued Contributor

06-16-2021 8:52:14 PM

1184 Views
1 reply
0 kudos

Resolved! Should I use Z Ordering on my Delta table every time I run Optimize?

Wondering if it always makes sense, or if there are some situations where you might only want to run OPTIMIZE.

Data Engineering

Latest Reply

Srikanth_Gupta_
Valued Contributor

06-17-2021 12:47:45 PM

0 kudos

It's a good idea to run OPTIMIZE at the end of each batch job to avoid a small-files situation. Z-ordering is optional and can be applied to a few non-partition columns that are used frequently in read operations. ZORDER BY -> Colocate column information in the sam...
