Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Send UPDATE from Databricks to Azure SQL Database

LukaszJ
Contributor III

Hello.

I want to know how to do an UPDATE on an Azure SQL Database from Azure Databricks using PySpark.

I know how to run a SELECT query and turn the result into a DataFrame, but how do I send data back (e.g., an UPDATE on some rows)?

I want to use built-in PySpark instead of pyodbc or something else.

Best Regards,

Łukasz


5 REPLIES

Kaniz_Fatma
Community Manager

Hi @Łukasz Jaremek! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer first; if not, I will get back to you soon. Thanks.

-werners-
Esteemed Contributor III

This is discussed on Stack Overflow. As you can see, there is a way for Azure Synapse, but for a plain Azure SQL database you will have to use some kind of driver like ODBC/JDBC.

LukaszJ
Contributor III

Hello werners,

Thank you for your reply.

In your link I found this link, which contains this piece of code:

# Load data from an Azure Synapse query.
df = spark.read \
[...]
 
# Apply some transformations to the data, then use the
# Data Source API to write the data back to another table in Azure Synapse.
 
df.write \
  .format("com.databricks.spark.sqldw") \
  .option("url", "jdbc:sqlserver://<the-rest-of-the-connection-string>") \
  .option("forwardSparkAzureStorageCredentials", "true") \
  .option("dbTable", "<your-table-name>") \
  .option("tempDir", "wasbs://<your-container-name>@<your-storage-account-name>.blob.core.windows.net/<your-directory-name>") \
  .save()

However, it is not a good solution for me.

First of all, what if I want to update one row in a table that has a few million records?

Do I need to download the entire table into a DataFrame, update one row, and resend it to Azure SQL? That seems really inefficient.

Moreover, what if I want to update a table that has an auto-incrementing identity column? I cannot download the table into a DataFrame, update it with my own id values, and resend it to Azure SQL.

That is why I am using pyodbc now, because I can run real SQL INSERT / UPDATE statements.
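For illustration, a minimal sketch of the kind of pyodbc UPDATE I mean (the server, database, credentials, and table/column names are placeholders):

import pyodbc

# Connect to Azure SQL Database; assumes the Microsoft ODBC driver is installed on the cluster.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=<server>.database.windows.net;"
    "DATABASE=<database>;"
    "UID=<user>;PWD=<password>"
)
cursor = conn.cursor()
# Parameterized row-level UPDATE, which a plain Spark JDBC write cannot do.
cursor.execute("UPDATE dbo.my_table SET status = ? WHERE id = ?", ("done", 42))
conn.commit()
cursor.close()
conn.close()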

Is there any elegant solution in Databricks?

Best regards,

Łukasz

-werners-
Esteemed Contributor III
Accepted Solution

The code you mention is for Azure Synapse. If you want to use plain Azure SQL, your options are limited.

You can use JDBC, but there is no update functionality.
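For example, Spark's built-in JDBC writer only offers table-level save modes; a minimal sketch (URL, credentials, and table name are placeholders):

# No row-level UPDATE here: "append" inserts rows, "overwrite" replaces the whole table.
df.write \
  .format("jdbc") \
  .option("url", "jdbc:sqlserver://<server>.database.windows.net;database=<database>") \
  .option("dbtable", "dbo.my_table") \
  .option("user", "<user>") \
  .option("password", "<password>") \
  .mode("append") \
  .save()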

Or you can use the optimized spark connector, but that lacks DML support:

Executing custom SQL through the connector

The previous Azure SQL Connector for Spark provided the ability to execute custom SQL code like DML or DDL statements through the connector. This functionality is out-of-scope of this connector since it is based on the DataSource APIs. This functionality is readily provided by libraries like pyodbc or you can use the standard java sql interfaces as well.
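For completeness, a minimal sketch of that last option, calling the standard java.sql interfaces through PySpark's py4j gateway (the URL, credentials, and UPDATE statement are placeholders; this runs on the driver node only):

# Use the JVM's java.sql.DriverManager; assumes the SQL Server JDBC driver is available on the cluster.
jdbc_url = "jdbc:sqlserver://<server>.database.windows.net;database=<database>"

driver_manager = spark.sparkContext._gateway.jvm.java.sql.DriverManager
conn = driver_manager.getConnection(jdbc_url, "<user>", "<password>")
try:
    stmt = conn.createStatement()
    # executeUpdate returns the number of affected rows
    affected = stmt.executeUpdate("UPDATE dbo.my_table SET status = 'done' WHERE id = 42")
    stmt.close()
finally:
    conn.close()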

LukaszJ
Contributor III

This is the information I needed.

Thank you for your help.

Regards,

Łukasz
