Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Handling partition overwrite in Liquid Clustering

Klusener
Contributor

Hello,

Currently we have Delta tables, terabytes in size, partitioned by year, month, and day. We perform dynamic partition overwrites, with partitionOverwriteMode set to dynamic, to handle reruns and corrections.
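For readers unfamiliar with the pattern, here is a minimal sketch of the kind of write described above. The function name and the table name in the usage note are hypothetical, and `df` is assumed to be a Spark DataFrame that still carries the year, month, and day partition columns.

```python
def overwrite_touched_partitions(df, table_name: str) -> None:
    """Overwrite only the partitions that have rows in `df`.

    Assumes `df` is a Spark DataFrame and `table_name` is an existing
    Delta table partitioned by year, month, day. Partitions with no
    rows in `df` are left untouched.
    """
    (
        df.write
        .format("delta")
        .mode("overwrite")
        # Replace only the partitions present in the incoming data.
        .option("partitionOverwriteMode", "dynamic")
        .saveAsTable(table_name)
    )
```

A rerun for one day would then be something like `overwrite_touched_partitions(corrected_df, "logs.events")`, where `corrected_df` holds only that day's rows.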

With liquid clustering, since explicit partitions are not required, how should we handle overwrite scenarios? Using MERGE is not feasible in our case, as the data consists of large log files with 100–1000 attributes.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions

Saritha_S
Databricks Employee

Hi @Klusener 

Good day!!

Dynamic partition overwrite supports selective overwrites only on partition columns. It does not work on liquid clustering columns or on regular columns.

If you know the exact predicates that identify the data to be replaced, use replaceWhere instead. Note: this is not possible unless the predicates are known at write time.

Reference doc: https://docs.databricks.com/aws/en/delta/selective-overwrite#arbitrary-selective-overwrite-with-repl...
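As a rough sketch of the replaceWhere approach for a rerun, assuming the liquid-clustered table keeps year, month, and day as ordinary integer columns: the helper names below are hypothetical, and deriving the predicate from the incoming DataFrame is one possible way to obtain it, not something the reference doc prescribes.

```python
from typing import Iterable, List, Tuple

Day = Tuple[int, int, int]  # (year, month, day)


def days_predicate(days: Iterable[Day]) -> str:
    """Build a replaceWhere predicate that matches exactly the given days."""
    clauses = sorted(
        {
            f"(year = {int(y)} AND month = {int(m)} AND day = {int(d)})"
            for y, m, d in days
        }
    )
    if not clauses:
        raise ValueError("at least one day is required")
    return " OR ".join(clauses)


def days_in(df) -> List[Day]:
    """Collect the distinct (year, month, day) values present in `df`.

    Assumes `df` is a Spark DataFrame with those three columns and that
    the number of distinct days is small enough to collect to the driver.
    """
    rows = df.select("year", "month", "day").distinct().collect()
    return [(r["year"], r["month"], r["day"]) for r in rows]


def overwrite_days(df, table_name: str, days: Iterable[Day]) -> None:
    """Atomically replace the rows for `days` in `table_name` with `df`."""
    (
        df.write
        .format("delta")
        .mode("overwrite")
        # Only rows matching the predicate are replaced.
        .option("replaceWhere", days_predicate(days))
        .saveAsTable(table_name)
    )
```

A correction run might then call `overwrite_days(corrected_df, "logs.events", days_in(corrected_df))`. By default Delta checks that every row being written matches the predicate, so `corrected_df` should contain only rows for the days being replaced.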

There is an open feature request to support this. You can track it at the link below.

Reference: https://databricks.aha.io/ideas/ideas/DB-I-10561

There is currently no ETA for the feature request. Our PM team will review it and prioritize it accordingly.

Please let us know if you need any further assistance or have any questions on this.

