- 367 Views
- 0 replies
- 0 kudos
We have a live streaming table created using the command
CREATE OR REFRESH STREAMING LIVE TABLE foo
TBLPROPERTIES ( "pipelines.autoOptimize.zOrderCols" = "c1,, c2, c3, c4", "delta.randomizeFilePrefixes" = "true" );
But when I run the show table propert...
- 2128 Views
- 3 replies
- 6 kudos
I have gone through the documentation but still cannot understand it. How is Bloom filter indexing a column different from Z-ordering a column? Can somebody explain what exactly happens when these two techniques are applied?
Latest Reply
Hey @Daniel Sahal, 1. A Bloom filter index is a space-efficient data structure that enables data skipping on chosen columns, particularly for fields containing arbitrary text. Refer to this code snippet to create a Bloom filter index: CREATE BLOOMFILTER INDEX
ON [TAB...
2 More Replies
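To make the "space-efficient data structure that enables data skipping" idea concrete, here is a minimal, generic Bloom filter sketch in Python. It is not Databricks' implementation; the class, the file names (`file_a`, `file_b`), the sample values, and the sizing parameters are all hypothetical. It only illustrates the general property that a negative answer lets a reader skip a file, while a positive answer means the file still has to be checked.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a bit array plus k salted hash functions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, value):
        # Derive k bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{value}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = True

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


# One filter per (hypothetical) data file, built over a text column.
file_filters = {}
for name, values in {"file_a": ["alice", "bob"], "file_b": ["carol", "dave"]}.items():
    bf = BloomFilter()
    for v in values:
        bf.add(v)
    file_filters[name] = bf


def files_to_scan(value):
    """Return the files whose filter cannot rule the value out."""
    return [n for n, bf in file_filters.items() if bf.might_contain(value)]
```

A lookup such as `files_to_scan("alice")` always includes `file_a`, because a Bloom filter never gives a false negative. It may occasionally include other files too, which is the price of the compact representation.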
- 1915 Views
- 5 replies
- 10 kudos
Hi, After appending new values to a Delta table, I need to delete duplicate rows. After deleting duplicate rows using PySpark, I overwrite the table (keeping the schema). My question is, do I have to do ZORDER again? Another question, is there another wa...
Latest Reply
Hi @Nurettin Ersoz, try using an incremental load so that duplicates are avoided. You can run a full load once if your data has updates.
4 More Replies
- 2035 Views
- 5 replies
- 4 kudos
Latest Reply
Hi @NOOR BASHA SHAIK​​, Please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.
4 More Replies
by
leos1
• New Contributor II
- 777 Views
- 2 replies
- 0 kudos
Is the order of the columns in ZORDER important? For example, does ZORDER BY (product, site) and ZORDER BY (site, product) produce the same results?
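As background for this question, here is a small sketch of a textbook Z-order (Morton) curve in Python. It is a generic illustration only, not Delta Lake's actual ZORDER implementation, and it does not settle how Databricks behaves. The function name `interleave_bits` and the 4x4 grid of sample points are assumptions made for the example. It shows that swapping the two inputs changes the exact sort sequence, while both orderings still keep nearby points grouped together.

```python
def interleave_bits(x, y, bits=8):
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i + 1)  # x takes the odd bit positions
        code |= ((y >> i) & 1) << (2 * i)      # y takes the even bit positions
    return code


# A hypothetical 4x4 grid standing in for two integer-encoded columns.
points = [(x, y) for x in range(4) for y in range(4)]

# Sort the same points with the two columns given in either order.
order_xy = sorted(points, key=lambda p: interleave_bits(p[0], p[1]))
order_yx = sorted(points, key=lambda p: interleave_bits(p[1], p[0]))
```

In this toy case the two sequences differ, but the first four points of each are the same 2x2 block, so both orderings cluster on both dimensions rather than favouring the first one the way a plain lexicographic sort would.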
- 1184 Views
- 1 replies
- 0 kudos
Wondering if it always makes sense, or if there are some situations where you might want to run only OPTIMIZE.
Latest Reply
It's a good idea to run OPTIMIZE at the end of each batch job to avoid a small-files problem. Z-ordering is optional and can be applied to a few non-partition columns that are frequently used in read operations. ZORDER BY -> Colocate column information in the sam...
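The difference between compaction alone and compaction with Z-ordering comes down to one optional clause. The sketch below is a small Python helper that builds the statement text. The helper name `build_optimize_statement` and the table name `events` are hypothetical, and the columns `product` and `site` are borrowed from the earlier question purely as placeholders.

```python
def build_optimize_statement(table, zorder_cols=None):
    """Return an OPTIMIZE statement, with an optional ZORDER BY clause."""
    statement = f"OPTIMIZE {table}"
    if zorder_cols:
        # Z-ordering is optional; add the clause only when columns are given.
        statement += " ZORDER BY (" + ", ".join(zorder_cols) + ")"
    return statement


# Compaction only, e.g. at the end of a batch job.
compaction_only = build_optimize_statement("events")

# Compaction plus Z-ordering on frequently filtered, non-partition columns.
with_zorder = build_optimize_statement("events", ["product", "site"])
```

In a Spark session with Delta Lake available, either string could be passed to `spark.sql(...)`. That step is left out here so the sketch runs without a cluster.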