Hi,
I have created two otherwise identical tables: one is partitioned and the other uses Liquid Clustering with automatic clustering.
I inserted 30M rows for each of two dates (60M rows in total), 2026-06-01 and 2026-06-02, and then overwrote the 2026-06-02 data with a selective overwrite statement.
For the partitioned table, the history shows:
| Field | Value |
| --- | --- |
| operationParameters | (no values shown) |
| operationMetrics | numRemovedBytes: "6090798096", numDeletionVectorsRemoved: "0", numOutputRows: "30000000", numOutputBytes: "6101142740" |
For the Liquid Clustered table, the history shows:
| Field | Value |
| --- | --- |
| operationParameters | clusteringOnWriteStatus: null, replaceUsingCols: "(BED)" |
| operationMetrics | numRemovedBytes: "5779903887", numDeletionVectorsAdded: "0", numDeletedRows: "30000000", numDeletionVectorsRemoved: "0", numOutputRows: "30000000", numOutputBytes: "5779715681" |
It is overwriting 43 files (86 in total). Is this optimal?
Is there a way to improve performance by reducing the number of files?
I am using:
INSERT INTO <target> REPLACE USING (col)
SELECT <cols> FROM <table>
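
To put the file count in context, here is a rough back-of-the-envelope sketch of the average size of the removed files, using the numRemovedBytes values from the history above. It assumes that the 43 overwritten files apply to each table's overwrite, which the history output above does not confirm; the function name is only illustrative.

```python
# Rough average size of the files removed by the overwrite.
# ASSUMPTION: 43 files were removed in each table's overwrite; the
# history output above does not show a file count, so adjust as needed.

MIB = 1024 * 1024  # bytes per mebibyte


def avg_file_size_mib(total_bytes: int, num_files: int) -> float:
    """Return the average file size in MiB for a group of files."""
    if num_files <= 0:
        raise ValueError("num_files must be positive")
    return total_bytes / num_files / MIB


# numRemovedBytes values copied from the history output above.
removed_bytes = {
    "partitioned": 6090798096,
    "liquid_clustered": 5779903887,
}

ASSUMED_FILES_REMOVED = 43  # assumption, see the note above

for table, total in removed_bytes.items():
    size = avg_file_size_mib(total, ASSUMED_FILES_REMOVED)
    print(f"{table}: about {size:.1f} MiB per file")
```

If the actual number of removed files differs between the two tables, replacing ASSUMED_FILES_REMOVED with the real counts gives the per-table averages.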