Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Populating data from Databricks to SQL Server tables

Mr__D
New Contributor II

Hello All,

Could anyone please suggest the best way to populate (upsert) data from a Delta table into a SQL Server table?

We transform our data in Databricks and store it in Delta tables, but for reporting purposes we need to populate the data into a SQL Server table as well.

What is the best way to do the upsert from the Delta table to SQL Server?

Would the Spark Synapse connector be the best option?

Thanks,

Deepak

1 REPLY

Anonymous
Not applicable

@Deepak Bhatt:

The Spark Synapse connector is built for Azure Synapse Analytics (formerly SQL Data Warehouse), so it is only the right choice if your reporting target is actually Synapse. For a plain SQL Server or Azure SQL table, the Apache Spark connector for SQL Server (com.microsoft.sqlserver.jdbc.spark) is the better fit.

Neither connector performs an upsert on its own. The usual pattern is to read the data from the Delta table using Spark's DataFrame API, bulk-write it into a staging table in SQL Server, and then run a MERGE statement on the SQL Server side to upsert the staging rows into the target table.

Here's some sample code that demonstrates this pattern with the SQL Server connector:

# Read the data to be upserted from the Delta table
delta_df = spark.read.format("delta").load("/path/to/delta/table")

# SQL Server connection details
jdbc_url = "jdbc:sqlserver://<your_server>:<your_port>;database=<your_database>"
user = "<your_username>"
password = "<your_password>"

# Step 1: bulk-load the Delta data into a staging table
# (the staging table is overwritten on every run)
delta_df.write \
    .format("com.microsoft.sqlserver.jdbc.spark") \
    .mode("overwrite") \
    .option("url", jdbc_url) \
    .option("dbtable", "dbo.<your_table>_staging") \
    .option("user", user) \
    .option("password", password) \
    .option("tableLock", "true") \
    .option("batchsize", "100000") \
    .option("reliabilityLevel", "BEST_EFFORT") \
    .save()

# Step 2: merge the staging rows into the target table.
# Adjust the key and column lists to match your schema.
merge_sql = """
MERGE INTO dbo.<your_table> AS target
USING dbo.<your_table>_staging AS source
    ON target.<your_primary_key> = source.<your_primary_key>
WHEN MATCHED THEN
    UPDATE SET target.<col1> = source.<col1>
WHEN NOT MATCHED THEN
    INSERT (<your_primary_key>, <col1>)
    VALUES (source.<your_primary_key>, source.<col1>);
"""

# Run the MERGE through the JDBC driver bundled with Databricks
# (this uses Spark's internal JVM gateway)
conn = spark._sc._gateway.jvm.java.sql.DriverManager \
    .getConnection(jdbc_url, user, password)
stmt = conn.createStatement()
stmt.executeUpdate(merge_sql)
stmt.close()
conn.close()
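The staging-table approach keeps the fast bulk-copy path for moving the data and lets SQL Server apply the upsert as a single transactional MERGE, which is far cheaper than row-by-row updates over JDBC. The DriverManager call above goes through Spark's internal JVM gateway; a library such as pyodbc would work just as well if it is installed on the cluster.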

Make sure to replace the placeholders with your own values for the SQL Server connection details, table names, key column, and column lists. The MERGE statement is what actually performs the upsert; the connector only handles the bulk write into the staging table.

Note that you may need to adjust the write options (batch size, table locking, reliability level) based on your specific requirements and environment.
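If the SQL Server connector library is not installed on your cluster, the staging write can also be done with Spark's built-in JDBC data source and the Microsoft JDBC driver that ships with Databricks. A minimal sketch (the table and credential placeholders are illustrative):

# Fallback: Spark's built-in JDBC writer; slower than the bulk-copy
# connector but requires no extra library on the cluster.
delta_df.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://<your_server>:<your_port>;database=<your_database>") \
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver") \
    .option("dbtable", "dbo.<your_table>_staging") \
    .option("user", "<your_username>") \
    .option("password", "<your_password>") \
    .option("batchsize", "10000") \
    .mode("overwrite") \
    .save()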
