Data Engineering

Row-level Concurrency and Liquid Clustering compatibility

JasonThomas
New Contributor III

The documentation is a little ambiguous:

"Row-level concurrency is only supported on tables without partitioning, which includes tables with liquid clustering."

https://docs.databricks.com/en/release-notes/runtime/14.2.html

"Tables with liquid clustering enabled support row-level concurrency in Databricks Runtime 13.3 LTS and above. Row-level concurrency is generally available in Databricks Runtime 14.2 and above for all tables with deletion vectors enabled."

https://docs.databricks.com/en/delta/clustering.html

Also, is there a method to enable cluster-on-write for MERGE INTO statements?

Most operations do not automatically cluster data on write. Operations that cluster on write include the following:

  • INSERT INTO operations

  • CTAS statements

  • COPY INTO from Parquet format

  • spark.write.format("delta").mode("append")


SparkJun
Databricks Employee

It is recommended to use DBR 14.2 or above, where row-level concurrency is supported by default. There currently isn't a way to enable cluster-on-write for MERGE INTO statements. As a workaround, you can consider clustering the source data before merging it.
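A different workaround from the one suggested above, sketched here under assumptions, is to let the MERGE write unclustered files and then run OPTIMIZE on the target. As far as I know, OPTIMIZE is what triggers clustering on liquid clustering tables. The helper name `merge_then_cluster`, the table names, and the single-column join key are all hypothetical. `spark` is assumed to be an active SparkSession (or anything else with a `.sql()` method).

```python
def merge_then_cluster(spark, target, source, key):
    """Run MERGE INTO, then OPTIMIZE the target table.

    Assumption: `target` is a Delta table with liquid clustering enabled,
    so OPTIMIZE incrementally clusters the files the MERGE just wrote.
    `spark` is any object exposing .sql(), e.g. a SparkSession.
    """
    statements = [
        # Upsert: update rows that match on the key, insert the rest.
        (
            f"MERGE INTO {target} AS t "
            f"USING {source} AS s "
            f"ON t.{key} = s.{key} "
            "WHEN MATCHED THEN UPDATE SET * "
            "WHEN NOT MATCHED THEN INSERT *"
        ),
        # MERGE does not cluster on write, so trigger clustering afterwards.
        f"OPTIMIZE {target}",
    ]
    for stmt in statements:
        spark.sql(stmt)
    return statements


# Hypothetical usage in a notebook where `spark` already exists:
# merge_then_cluster(spark, "main.sales.orders", "main.sales.orders_updates", "order_id")
```

Running OPTIMIZE after every merge may be wasteful for small, frequent merges. Scheduling it separately is a reasonable alternative.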

JasonThomas
New Contributor III

Cluster-on-write is something being worked on. The limitations at the moment have to do with accommodating streaming workloads.

I found the following informative:

https://www.youtube.com/watch?v=5t6wX28JC_M

