Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Handling Concurrent Writes to a Delta Table by delta-rs and Databricks Spark Job

prem14f
New Contributor II

Hi @dennyglee, @Retired_mod.

If I am writing data into a Delta table from both delta-rs and a Databricks job and some transactions are lost, how can I handle this?

Given that Databricks runs a commit service and delta-rs uses DynamoDB for transaction logs, how can we handle concurrent writers from Databricks jobs and delta-rs writers on the same table?
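For context, delta-rs coordinates concurrent S3 writers through a DynamoDB lock table configured via `storage_options`. A minimal configuration sketch, assuming the Python `deltalake` package; the bucket, region, and lock-table names are placeholders:

```python
# Sketch: pointing delta-rs at a DynamoDB lock table so that concurrent
# delta-rs writers serialize their commits to the S3 transaction log.
# Bucket, region, and table names below are illustrative placeholders.
import pandas as pd
from deltalake import write_deltalake

storage_options = {
    "AWS_REGION": "us-east-1",
    # Route S3 commits through a DynamoDB lock so two delta-rs writers
    # cannot both claim the same transaction-log entry.
    "AWS_S3_LOCKING_PROVIDER": "dynamodb",
    "DELTA_DYNAMO_TABLE_NAME": "delta_log_lock",
}

write_deltalake(
    "s3://my-bucket/tables/events",
    pd.DataFrame({"id": [1], "value": ["a"]}),
    mode="append",
    storage_options=storage_options,
)
```

Note that this lock only coordinates delta-rs writers among themselves; as the question observes, Databricks jobs commit through the Databricks commit service, which is a separate mechanism.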

1 REPLY

Retired_mod
Esteemed Contributor III

Hi @prem14f, a few strategies:

- To recover lost transactions, implement retry logic with automatic retries, and make writes idempotent so a retry cannot duplicate data.
- For concurrent writers, rely on optimistic concurrency control: conflicts are detected at commit time, and the losing writer can resolve and retry.
- Partition your Delta table (and, where possible, scope each writer to disjoint partitions) to reduce the likelihood of commit conflicts.
- Ensure both writers are properly configured with access to the same transaction log.

An example implementation in Databricks retries a failed write after a delay. Additionally, set up monitoring, alerts, and conflict resolution strategies so issues are addressed promptly.
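The retry-with-delay idea can be sketched in plain Python. Everything here is illustrative: `CommitConflict` stands in for whatever conflict exception the writer raises (e.g. Delta's `ConcurrentAppendException`), and `flaky_write` is a stub for the real, idempotent write:

```python
import random
import time


class CommitConflict(Exception):
    """Placeholder for a concurrent-commit failure; the real exception
    type depends on the writer (Spark vs. delta-rs)."""


def write_with_retries(write_fn, max_attempts=5, base_delay=1.0):
    """Retry a write with exponential backoff plus jitter.

    write_fn must be idempotent (e.g. a MERGE keyed on a unique id),
    so a retry after an ambiguous failure cannot duplicate rows.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write_fn()
        except CommitConflict:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, 0.1)
            time.sleep(delay)


# Demo: a stub writer that hits a conflict twice, then succeeds.
attempts = {"n": 0}

def flaky_write():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise CommitConflict("concurrent commit detected")
    return "committed"

result = write_with_retries(flaky_write, base_delay=0.01)
```

The backoff and jitter matter with multiple concurrent writers: without them, conflicting writers retry in lockstep and collide again.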

Is there a specific part of this process you’d like to dive deeper into?
