Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Identity column and impact on performance

norbitek
New Contributor II

Hi,

I want to define an identity column in a Delta table.

According to the documentation:

"Declaring an identity column on a Delta table disables concurrent transactions. Only use identity columns in use cases where concurrent writes to the target table are not required."

This is clear to me when data is written to the target table from different sessions at the same time. But does it also apply when I execute a MERGE statement to upsert or delete data from a single session only?

I have no idea how MERGE statements work internally.

Maybe MERGE uses concurrent writes internally, and with an identity column enabled on the target table the data would be written sequentially instead?

1 REPLY

Sidhant07
Databricks Employee
  • Declaring an identity column on a Delta table disables concurrent transactions on that table, and this applies to MERGE as well. Upserts and deletes against the table must therefore be executed sequentially, one transaction after another, even if they all originate from a single session. Identity columns in Delta Lake assign a unique value to each record, and serializing the writes is how that uniqueness is ensured.
  • As a result, the restriction affects a MERGE statement by potentially enforcing sequential writes, even when the operation is run from a single session. This design choice is necessary to maintain the integrity and uniqueness of the identity values assigned to records.
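To make the scenario concrete, here is a minimal sketch of a target table with an identity column and a single-session MERGE against it. The table and column names (`customers`, `customer_updates`, `customer_sk`, `customer_id`, `name`, `is_deleted`) and the helper `run_sequentially` are hypothetical, chosen only for illustration; the SQL uses the Databricks `GENERATED ALWAYS AS IDENTITY` and `MERGE INTO` syntax. Note that the INSERT clause leaves out the identity column so that Delta generates its value.

```python
from typing import Callable, List

# Hypothetical target table: customer_sk is the identity column.
CREATE_TARGET = """
CREATE TABLE IF NOT EXISTS customers (
  customer_sk BIGINT GENERATED ALWAYS AS IDENTITY,
  customer_id STRING,
  name STRING
) USING DELTA
"""

# Hypothetical upsert/delete from a source named customer_updates.
# The identity column is not listed in the INSERT clause.
MERGE_UPSERT = """
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.customer_id = s.customer_id
WHEN MATCHED AND s.is_deleted = true THEN DELETE
WHEN MATCHED THEN UPDATE SET t.name = s.name
WHEN NOT MATCHED AND s.is_deleted = false THEN
  INSERT (customer_id, name) VALUES (s.customer_id, s.name)
"""


def run_sequentially(execute: Callable[[str], object]) -> List[str]:
    """Run the DDL and then the MERGE, one statement after another.

    `execute` is whatever runs a SQL string, for example `spark.sql`
    in a notebook. Returns the statements in the order they were run.
    """
    executed = []
    for statement in (CREATE_TARGET, MERGE_UPSERT):
        sql = statement.strip()
        execute(sql)
        executed.append(sql)
    return executed
```

In a Databricks notebook, where a `spark` session is already defined and a `customer_updates` source exists, this could be called as `run_sequentially(spark.sql)`. Because the statements are issued from one session, one after another, no two transactions try to write to the table at the same time.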
