Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Populating data from Databricks to SQL Server tables

Mr__D
New Contributor II

Hello All,

Could anyone please suggest the best way to populate (upsert) data from a Delta table into a SQL Server table?

We transform our data in Databricks and store it in Delta tables, but for reporting purposes we need to populate the data into a SQL Server table as well.

What is the best way to do the upsert from the Delta table to SQL Server?

Would the Spark Synapse connector be the best option?

Thanks,

Deepak

1 REPLY

Anonymous
Not applicable

@Deepak Bhatt:

The Spark Synapse connector is built for Azure Synapse Analytics (formerly SQL Data Warehouse), so it is only the right choice if your reporting target is actually Synapse. For a plain SQL Server or Azure SQL table, the Apache Spark connector for SQL Server (com.microsoft.sqlserver.jdbc.spark) is the better fit.

Neither connector performs an upsert on its own. The usual pattern is to read the data from the Delta table using Spark's DataFrame API, bulk-write it into a staging table in SQL Server, and then run a MERGE statement on the SQL Server side to upsert the staging rows into the target table.

Here's some sample code that demonstrates this pattern with the SQL Server connector:

# Read the data to be upserted from the Delta table
delta_df = spark.read.format("delta").load("/path/to/delta/table")

# SQL Server connection details
jdbc_url = "jdbc:sqlserver://<your_server>:<your_port>;database=<your_database>"
user = "<your_username>"
password = "<your_password>"

# Step 1: bulk-load the Delta data into a staging table
# (the staging table is overwritten on every run)
delta_df.write \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("url", jdbc_url) \
    .option("dbtable", "dbo.<your_table>_staging") \
    .option("user", user) \
    .option("password", password) \
    .option("tableLock", "true") \
    .option("batchsize", "100000") \
    .option("reliabilityLevel", "BEST_EFFORT") \
    .save()

# Step 2: merge the staging rows into the target table.
# Adjust the key and column lists to match your schema.
merge_sql = """
MERGE INTO dbo.<your_table> AS target
USING dbo.<your_table>_staging AS source
    ON target.<your_primary_key> = source.<your_primary_key>
WHEN MATCHED THEN
    UPDATE SET target.<col1> = source.<col1>
WHEN NOT MATCHED THEN
    INSERT (<your_primary_key>, <col1>)
    VALUES (source.<your_primary_key>, source.<col1>);
"""

# Run the MERGE through the JDBC driver bundled with Databricks
# (this uses Spark's internal JVM gateway)
conn = spark._sc._gateway.jvm.java.sql.DriverManager \
    .getConnection(jdbc_url, user, password)
stmt = conn.createStatement()
stmt.executeUpdate(merge_sql)
stmt.close()
conn.close()
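The staging-table approach keeps the fast bulk-copy path for moving the data and lets SQL Server apply the upsert as a single transactional MERGE, which is far cheaper than row-by-row updates over JDBC. The DriverManager call above goes through Spark's internal JVM gateway; a library such as pyodbc would work just as well if it is installed on the cluster.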

Make sure to replace the placeholders with your own values for the SQL Server connection details, table names, key column, and column lists. The MERGE statement is what actually performs the upsert; the connector only handles the bulk write into the staging table.

Note that you may need to adjust the write options (batch size, table locking, reliability level) based on your specific requirements and environment.
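If the SQL Server connector library is not installed on your cluster, the staging write can also be done with Spark's built-in JDBC data source and the Microsoft JDBC driver that ships with Databricks. A minimal sketch (the table and credential placeholders are illustrative):

# Fallback: Spark's built-in JDBC writer; slower than the bulk-copy
# connector but requires no extra library on the cluster.
delta_df.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://<your_server>:<your_port>;database=<your_database>") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .option("dbtable", "dbo.<your_table>_staging") \
    .option("user", "<your_username>") \
    .option("password", "<your_password>") \
    .option("batchsize", "10000") \
    .mode("overwrite") \
    .save()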
