Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

ACID properties in Delta?

sriradh
New Contributor

How are locks maintained within a Delta Lake? For instance, say there are two simple tables, customer_details and orders. I am running a job that inserts an order of, say, $100 into the orders table for a specific customerId; it should then update the customer_details table, incrementing the order_count value by 1 and increasing the order_value by 100. Note that until the orders table is fully updated with all the information, the customer_details table should not be updated; and once the orders row is inserted or deleted, the customer_details table HAS to be updated with the right counts and dollars.

In a traditional DB, we have the concept of savepoints, where we can combine multiple CRUD operations into a 'transaction' and either fail (roll back) everything or commit everything to the DB. How is this possible in a Delta environment? While ACID capabilities exist at an individual table level, how can this be achieved across tables in a Delta Lake? (Kindly note that updating the customer_details table after the fact as a batch job is a solution, but this is just a simple use case I have posted; there is a good chance that an "order" can also require data to be stored in multiple tables.) Thanks in advance.
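To make the gap concrete, here is a rough PySpark sketch of the two steps (table and column names as above; the values are made up). Each statement commits on its own, so each is atomic only for its own table:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Commit 1: insert the order. Atomic, but only for the orders table.
spark.sql("""
    INSERT INTO orders (order_id, customer_id, order_value)
    VALUES (1001, 42, 100.00)
""")

# Commit 2: update the rollup. Atomic, but only for customer_details.
# A failure between the two commits leaves the tables inconsistent --
# exactly what a traditional BEGIN ... COMMIT block would prevent.
spark.sql("""
    UPDATE customer_details
    SET order_count = order_count + 1,
        order_value = order_value + 100.00
    WHERE customer_id = 42
""")
```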

1 ACCEPTED SOLUTION


Kaniz_Fatma
Community Manager

Hi @sriradh

In Delta Lake, ACID transaction guarantees are provided between reads and writes. This means that multiple writers, even across multiple clusters, can modify a table partition simultaneously; writers see a consistent snapshot view of the table, and there is a serial order of writes. Readers continue to see the consistent snapshot view of the table that their Databricks job started with, even when the table is modified during the job.
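As a small illustration (assuming a Delta table named orders exists): every commit creates a new table version, a query always runs against one consistent version, and time travel lets you pin a snapshot explicitly.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Each write to a Delta table produces a new, numbered version.
spark.sql("DESCRIBE HISTORY orders").show()

# A query sees exactly one version, even while writers keep committing.
current = spark.sql("SELECT * FROM orders")

# Time travel pins an explicit snapshot by version number.
pinned = spark.sql("SELECT * FROM orders VERSION AS OF 0")
```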

Delta Lake on Databricks supports two isolation levels: Serializable and WriteSerializable. Serializable is the strongest level: it ensures that committed write operations and all reads are serializable. WriteSerializable, the default, is weaker: it guarantees only that write operations are serializable, not reads.
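The isolation level is a per-table property. A sketch of switching the hypothetical orders table from the question to the stricter level:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# WriteSerializable is the default; this opts the table into the
# strongest level, at the cost of more write conflicts.
spark.sql("""
    ALTER TABLE orders
    SET TBLPROPERTIES ('delta.isolationLevel' = 'Serializable')
""")
```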

Delta Lake also provides row-level concurrency to reduce conflicts between concurrent write operations by detecting changes at the row level. It automatically resolves competing changes when concurrent writes update or delete different rows in the same data file, so fewer transactions fail with conflict errors.
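Row-level concurrency depends on deletion vectors being enabled for the table (an assumption based on the Databricks docs; availability varies by Databricks Runtime version). Enabling them is a table property, for example:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Deletion vectors let Delta track row-level changes inside data files,
# which is what row-level conflict resolution builds on.
spark.sql("""
    ALTER TABLE customer_details
    SET TBLPROPERTIES ('delta.enableDeletionVectors' = 'true')
""")
```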

Beyond that, you can choose the isolation level per table and avoid conflicts altogether by partitioning your data and keeping the conditions of concurrent commands disjoint. When a conflict does occur, Delta raises a concurrent-modification exception that your job can catch and retry.
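For the exception route, here is a hedged sketch of a retry loop. The exception classes come from the delta-spark Python package; update_with_retry is just an illustrative helper, not a built-in:

```python
import time

from delta.exceptions import (
    ConcurrentAppendException,
    ConcurrentDeleteReadException,
)
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

def update_with_retry(statement: str, max_attempts: int = 3) -> None:
    """Run a mutating statement, retrying if a concurrent write conflicts."""
    for attempt in range(1, max_attempts + 1):
        try:
            spark.sql(statement)
            return
        except (ConcurrentAppendException, ConcurrentDeleteReadException):
            if attempt == max_attempts:
                raise  # give up after the last attempt
            time.sleep(2 ** attempt)  # back off before retrying

update_with_retry("""
    UPDATE customer_details
    SET order_count = order_count + 1
    WHERE customer_id = 42
""")
```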

I hope this helps!


