Databricks Community

swzzzsw · ‎01-24-2022

I'm using Databricks to connect to a SQL managed instance via JDBC. The SQL operations I need to perform include DELETE, UPDATE, and simple reads and writes. Since the Spark syntax only handles simple reads and writes, I had to open a SQL connection using Scala and run the DELETE and UPDATE queries through it.

Here's the sample Scala code I use to execute delete queries:

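import java.sql.DriverManager

// jdbcUrl, jdbcUsername, and jdbcPassword are defined elsewhere in the notebook
val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
try {
  val statement = connection.createStatement()
  val queryStr = "DELETE FROM SAMPLE"
  // execute returns false for statements such as DELETE that produce no result set
  val res = statement.execute(queryStr)
} finally {
  // always release the connection, even if the query fails
  connection.close()
}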

These lines work perfectly fine if I run one notebook at a time. However, when I run several notebooks in parallel, I run into deadlocks (see below).

How can I resolve this error?

-werners- · ‎01-25-2022

This is not a Spark error; it comes purely from the database.

There are tons of articles online on how to prevent deadlocks, but there is no single solution for this.

swzzzsw · ‎01-25-2022

I'm not a fluent Scala user. Do you happen to know of a solution that works with JDBC in Scala?

-werners- · ‎01-25-2022

The issue is not your code but the fact that you run the queries in parallel. SQL Server cannot always handle that, because concurrent sessions can end up waiting on each other's locks.

For example, one notebook run is doing an update while another wants to delete that same record.

swzzzsw · ‎01-25-2022

Got it! Thank you so much! It looks like I can use error handling to rerun the deadlock victim until it works. Thanks for pointing me in the right direction!
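
For anyone landing here later, the "rerun the deadlock victim" idea could be sketched roughly as below. This is only a minimal sketch, not something posted in the thread: the `DeadlockRetry` object, the `withDeadlockRetry` helper, and its `maxAttempts` / `delayMs` parameters are hypothetical names. It relies on SQL Server reporting a deadlock victim as error 1205, which the JDBC driver exposes through `SQLException.getErrorCode`.

```scala
import java.sql.SQLException
import scala.util.{Failure, Success, Try}

object DeadlockRetry {
  // SQL Server raises error 1205 for the session chosen as the deadlock victim.
  val DeadlockErrorCode = 1205

  // Hypothetical helper (not from the thread): runs `op` and, if it fails as a
  // deadlock victim, waits and tries again, up to `maxAttempts` attempts in total.
  // The wait doubles after every failed attempt.
  def withDeadlockRetry[T](maxAttempts: Int = 5, delayMs: Long = 200L)(op: => T): T =
    Try(op) match {
      case Success(value) => value
      case Failure(e: SQLException)
          if e.getErrorCode == DeadlockErrorCode && maxAttempts > 1 =>
        Thread.sleep(delayMs)
        withDeadlockRetry(maxAttempts - 1, delayMs * 2)(op)
      // Any other error, or a deadlock on the last attempt, is rethrown unchanged.
      case Failure(other) => throw other
    }
}
```

With the connection from the question, a call might look like `DeadlockRetry.withDeadlockRetry() { statement.execute("DELETE FROM SAMPLE") }`. Retrying only on error 1205 keeps genuine failures (bad SQL, lost connection) visible instead of looping on them.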

Dineshvishe · ‎10-20-2024

You have to manage the timing and run the operations step by step so that deadlocks can be avoided. This is purely a database problem, which can be avoided by spacing the operations out in time or by keeping the database transactions short.

Panda · ‎10-21-2024

@swzzzsw
Since you are performing database operations, to reduce the chances of deadlocks, make sure to wrap your SQL operations inside transactions using commit and rollback.

Other approaches to consider are adding retry logic or using isolation levels. For more information, refer to the Databricks documentation on isolation levels (Isolation Levels Documentation).
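
As a rough illustration of the commit/rollback suggestion, here is a minimal sketch using plain JDBC. It assumes the isolation level in question is the transaction isolation level set on the JDBC connection to SQL Server; the `TransactionSketch` object and the `inTransaction` helper are hypothetical names, not part of any library. Which isolation level actually helps depends on the workload, so it is left as a parameter.

```scala
import java.sql.Connection

object TransactionSketch {
  // Hypothetical helper: runs `work` as a single transaction on `connection`.
  // Commits if `work` succeeds, rolls back if it throws, and restores the
  // connection's previous auto-commit setting afterwards.
  def inTransaction[T](
      connection: Connection,
      isolation: Int = Connection.TRANSACTION_READ_COMMITTED
  )(work: Connection => T): T = {
    val previousAutoCommit = connection.getAutoCommit
    // Set the isolation level before the transaction starts.
    connection.setTransactionIsolation(isolation)
    connection.setAutoCommit(false)
    try {
      val result = work(connection)
      connection.commit()
      result
    } catch {
      case e: Throwable =>
        connection.rollback()
        throw e
    } finally {
      connection.setAutoCommit(previousAutoCommit)
    }
  }
}
```

A call could look like `TransactionSketch.inTransaction(connection) { c => c.createStatement().executeUpdate("DELETE FROM SAMPLE") }`. Keeping the body of the transaction short means locks are held for less time, which in turn lowers the chance that two notebooks block each other.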

