
Insert Into SQLServer Table

Unimog
New Contributor III

I'm trying to insert and update data in a SQL Server table from a Python script. No matter what I try, I get this error:
The input query contains unsupported data source(s). Only csv, json, avro, delta, kafka, parquet, orc, text, unity_catalog, binaryFile, xml, simplescan, iceberg, mysql, postgresql, sqlserver, redshift, snowflake, sqldw, databricks, bigquery, oracle, salesforce, salesforce_data_cloud, teradata, workday_raas, mongodb data sources are supported on serverless compute...

- I've tried spark.sql on both serverless and non-serverless compute.
- I've tried updating with dataframe.write...
- I've tried a federated connection as well as opening a JDBC connection (see the sketch after this list).
- Based on the error message and suggestions from the assistant, I've tried writing the input data to a Delta table first, then updating from there.
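
For reference, the JDBC attempt looked roughly like this (the host, database, credentials, and table name are placeholders, not the real values):

# Placeholder connection details; the real host, database, and credentials go here.
(update_df.write
    .format("jdbc")
    .mode("append")
    .option("url", "jdbc:sqlserver://xxx.database.windows.net:1433;databaseName=xxx")
    .option("dbtable", "xxx")
    .option("user", "xxx")
    .option("password", "xxx")
    .save())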

Is anyone out there able to write to an existing SQL Server table who could give me some hints?


5 REPLIES

Nivethan_Venkat
Contributor

Hi @Unimog, could you share the DataFrame write statement, or more details on the error beyond the allowed file formats?

Unimog
New Contributor III
Hi Nivethan, I have tried a bunch of methods to do this. Here is one:

(update_df.write
    .format("sqlserver")
    .mode("append")
    .option("host", "xxx.database.windows.net")
    .option("user", "xxx")
    .option("password", "xxx")
    .option("dbtable", "xxx")
    .option("batchsize", 20000)
    .option("mssqlIsolationLevel", "READ_UNCOMMITTED")
    .save())

Unimog
New Contributor III

Here is the full text of the error message:

The input query contains unsupported data source(s). Only csv, json, avro, delta, kafka, parquet, orc, text, unity_catalog, binaryFile, xml, simplescan, iceberg, mysql, postgresql, sqlserver, redshift, snowflake, sqldw, databricks, bigquery, oracle, salesforce, salesforce_data_cloud, teradata, workday_raas, mongodb data sources are supported on serverless compute, and only csv, json, avro, delta, kafka, parquet, orc, text, unity_catalog, binaryFile, xml, simplescan, iceberg data sources are allowed to run DML on serverless compute.

Nivethan_Venkat
Contributor

Hi @Unimog,

Currently, support for data sources on serverless compute is limited to those listed in the General Serverless Limitations:

  • Support for data sources is limited to AVRO, BINARYFILE, CSV, DELTA, JSON, KAFKA, ORC, PARQUET, TEXT, and XML.

Hopefully support will soon be extended to more formats, including the sqlserver format used in your DataFrame write. In the meantime, writing with Delta as the format should work without any issues.
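
As a minimal sketch of that approach (the staging table name below is a placeholder, not from your setup):

# Stage the updates in a Delta table; Delta is an allowed format for DML
# on serverless compute. The table name is a placeholder.
(update_df.write
    .format("delta")
    .mode("append")
    .saveAsTable("catalog.schema.sqlserver_updates_staging"))

The actual push to SQL Server (your .format("sqlserver") write) can then run as a separate step on standard compute, where that data source is allowed for writes.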

Unimog
New Contributor III

I thought I had already tried this on general-purpose compute, but apparently not. After reading the docs you referenced, I retried using standard compute and it works perfectly.

Thanks!
