Re: Spark JDBC Write Fails for Record Not Present ...

ManojkMohan · 10-28-2025

@jorperort When writing to SQL Server tables with composite primary keys from Databricks over JDBC, unique constraint violations are often caused by Spark’s distributed retry logic: a retried task can try to insert rows that an earlier attempt already committed. See https://docs.databricks.com/aws/en/archive/connectors/jdbc

Solutions
Write to Staging Table and Use MERGE:
The recommended approach is to route batch writes to a temporary or staging table in SQL Server, then run a database-level MERGE (upsert) from the staging table into the target table.
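
A minimal sketch of the MERGE step, assuming the staging table has the same columns as the target. The helper name build_merge_sql and the table and column names (dbo.orders, dbo.orders_staging, order_id, line_no, qty, price) are hypothetical placeholders, not taken from your schema.

```python
def build_merge_sql(target, staging, key_cols, value_cols):
    """Build a T-SQL MERGE that upserts rows from `staging` into `target`.

    key_cols   -- columns of the composite primary key (used in the ON clause)
    value_cols -- non-key columns to update when a key already exists
    """
    on = " AND ".join(f"t.[{c}] = s.[{c}]" for c in key_cols)
    all_cols = list(key_cols) + list(value_cols)
    insert_cols = ", ".join(f"[{c}]" for c in all_cols)
    insert_vals = ", ".join(f"s.[{c}]" for c in all_cols)

    lines = [
        f"MERGE INTO {target} AS t",
        f"USING {staging} AS s",
        f"ON {on}",
    ]
    # Only emit an UPDATE branch when there is something to update.
    if value_cols:
        set_clause = ", ".join(f"t.[{c}] = s.[{c}]" for c in value_cols)
        lines.append(f"WHEN MATCHED THEN UPDATE SET {set_clause}")
    lines.append(
        f"WHEN NOT MATCHED BY TARGET THEN INSERT ({insert_cols}) VALUES ({insert_vals});"
    )
    return "\n".join(lines)


# Hypothetical table and column names, for illustration only.
merge_sql = build_merge_sql(
    "dbo.orders",
    "dbo.orders_staging",
    key_cols=["order_id", "line_no"],
    value_cols=["qty", "price"],
)
print(merge_sql)
```

The generated statement can be run over any SQL Server connection once the staging write has finished. Note that MERGE fails if the staging table holds more than one row for the same key, so the staging data should be deduplicated on the key columns first.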

Tune Write Parallelism:
Adjust numPartitions and batchsize, and set the transaction isolation level through the JDBC options, to minimize retry issues. See the official options and guidance on parallelism:

https://docs.databricks.com/aws/en/archive/connectors/jdbc#control-parallelism-for-jdbc-queries
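
As a rough sketch, these settings map onto the Spark JDBC writer options numPartitions, batchsize, and isolationLevel. The helper name staging_write_options, the default values, the URL placeholders, and the table name are assumptions for illustration; tune them for your workload and add your own credentials.

```python
def staging_write_options(jdbc_url, table, num_partitions=4,
                          batch_size=5000, isolation_level="READ_COMMITTED"):
    """Collect Spark JDBC writer options into one dict.

    numPartitions  -- upper bound on concurrent JDBC connections for the write
    batchsize      -- rows sent per JDBC batch insert
    isolationLevel -- transaction isolation used for each partition's write
    """
    return {
        "url": jdbc_url,
        "dbtable": table,
        "numPartitions": str(num_partitions),
        "batchsize": str(batch_size),
        "isolationLevel": isolation_level,
    }


# Placeholder URL and hypothetical table name; credentials are omitted.
opts = staging_write_options(
    "jdbc:sqlserver://<host>:1433;databaseName=<db>",
    "dbo.orders_staging",
)
print(opts)

# With a real DataFrame `df`, the options would be applied like this:
# df.write.format("jdbc").options(**opts).mode("overwrite").save()
```

Lower parallelism means fewer concurrent transactions against the table, at the cost of a slower write.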

Validate DataFrame for Duplicates:
Always call .dropDuplicates([PK columns]) on the DataFrame before writing, so that no two rows share the same primary key.
https://docs.databricks.com/aws/en/archive/connectors/jdbc
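
A small sketch of that check, which also reports the offending keys so you can see whether the source data really contains duplicates. The function name split_duplicates and the column names in the usage comment are hypothetical; df is assumed to be a PySpark DataFrame.

```python
def split_duplicates(df, pk_cols):
    """Return (deduped_df, duplicate_keys_df) for the given key columns.

    deduped_df         -- one row per key; which duplicate row survives is arbitrary
    duplicate_keys_df  -- keys that occur more than once, with their row counts
    """
    pk_cols = list(pk_cols)
    duplicate_keys = df.groupBy(*pk_cols).count().filter("count > 1")
    deduped = df.dropDuplicates(pk_cols)
    return deduped, duplicate_keys


# Usage with a real DataFrame `df` (hypothetical key columns):
# deduped, dupes = split_duplicates(df, ["order_id", "line_no"])
# dupes.show()             # inspect conflicting keys before writing
# deduped.write.format("jdbc")...
```

If the surviving row matters, pick it explicitly (for example by ordering on a timestamp column) rather than relying on dropDuplicates.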

SQL Server’s “IGNORE_DUP_KEY” index option can sometimes help, but since yours is OFF, conflicts are not ignored. Databricks guidance on the JDBC driver:
https://docs.databricks.com/aws/en/ingestion/lakeflow-connect/sql-server-source-setup