07-11-2024 03:55 PM
Hello,
Is there equivalent SQL for the following PySpark code? I'm trying to copy a table from SQL Server to Databricks and save it as a managed Delta table.
jdbcHostname = "your_sql_server_hostname"
jdbcPort = 1433
jdbcDatabase = "your_database_name"
jdbcUsername = "your_username"
jdbcPassword = "your_password"
# JDBC URL format for SQL Server
jdbcUrl = f"jdbc:sqlserver://{jdbcHostname}:{jdbcPort};database={jdbcDatabase}"
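# Connection properties
connectionProperties = {
    "user": jdbcUsername,
    "password": jdbcPassword,
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}
# Source table name, or a parenthesized subquery with an alias (placeholder; not defined in the original post)
query = "(SELECT * FROM your_schema.your_table) AS src"
# Read from SQL Server over JDBC
df = spark.read.jdbc(url=jdbcUrl, table=query, properties=connectionProperties)
# Save as a managed Delta table, replacing any existing data
df.write.format("delta").mode("overwrite").saveAsTable("table_name")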
07-12-2024 12:04 AM
I take it you are asking for a SQL version of the PySpark code? Could you explain what advantage SQL would have over PySpark here? There are a few options. The best would probably be a federated query against SQL Server: select from it as if it were a Databricks table and write the result to the target. Alternatively, you could create a view over the SQL Server table and use it as the source for an INSERT INTO the Databricks table. In my limited understanding (I could be wrong), all of these would be optimized similarly in the background, so rewriting the code in SQL brings no additional benefit.
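For reference, the view alternative mentioned above could look roughly like the sketch below. It assumes a notebook attached to a cluster where Spark's JDBC data source and the SQL Server driver are available; the helper name `jdbc_view_statements`, the view name `sqlserver_src`, and all placeholder values are made up for illustration. The helper only builds the SQL text, so the statements can be handed to someone who works purely in SQL.

```python
def jdbc_view_statements(host, port, database, source_table, target_table,
                         user, password, view_name="sqlserver_src"):
    """Build two Spark SQL statements: a temporary view over a SQL Server
    table (via the JDBC data source) and a CTAS that copies it to a table.

    Hypothetical helper for illustration; values are inserted as-is,
    without escaping, so only pass trusted strings.
    """
    url = f"jdbc:sqlserver://{host}:{port};database={database}"
    create_view = (
        f"CREATE OR REPLACE TEMPORARY VIEW {view_name}\n"
        "USING org.apache.spark.sql.jdbc\n"
        "OPTIONS (\n"
        f"  url '{url}',\n"
        f"  dbtable '{source_table}',\n"
        f"  user '{user}',\n"
        f"  password '{password}',\n"
        "  driver 'com.microsoft.sqlserver.jdbc.SQLServerDriver'\n"
        ")"
    )
    # CREATE OR REPLACE mirrors mode("overwrite") in the PySpark version.
    copy_table = (
        f"CREATE OR REPLACE TABLE {target_table} AS SELECT * FROM {view_name}"
    )
    return [create_view, copy_table]


if __name__ == "__main__":
    statements = jdbc_view_statements(
        host="your_sql_server_hostname",
        port=1433,
        database="your_database_name",
        source_table="your_schema.your_table",  # placeholder
        target_table="table_name",
        user="your_username",
        password="your_password",
    )
    # Print the SQL so it can be pasted into a SQL cell.
    for stmt in statements:
        print(stmt + ";\n")
```

In a notebook, each statement could also be run with `spark.sql(stmt)`. Note that the password ends up in plain text in the SQL, just as it does in the PySpark version.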
07-15-2024 01:19 PM
@ranged_coop Yes, I'm asking for a SQL version of my PySpark code. The only reason is to hand it to someone who only codes in SQL, which would make it easier for them to understand. Thanks for the suggested solution!
07-12-2024 12:08 AM
The only option for doing this in Databricks SQL is Lakehouse Federation with a SQL Server connection.
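A rough sketch of what the Lakehouse Federation route might look like, assuming Unity Catalog is enabled and the user has the privileges to create connections and catalogs. The helper name `federation_statements`, the connection and catalog names, and the secret scope and key are all hypothetical placeholders. As written, the helper only produces the SQL text, which is the part a SQL-only colleague would need.

```python
def federation_statements(connection, catalog, host, port, database, user,
                          secret_scope, secret_key,
                          source_schema, source_table, target_table):
    """Build the SQL for copying a SQL Server table through Lakehouse
    Federation: a connection, a foreign catalog, then a CTAS.

    Hypothetical helper for illustration; values are inserted as-is,
    without escaping, so only pass trusted strings.
    """
    create_connection = (
        f"CREATE CONNECTION {connection} TYPE sqlserver\n"
        "OPTIONS (\n"
        f"  host '{host}',\n"
        f"  port '{port}',\n"
        f"  user '{user}',\n"
        # The password is read from a Databricks secret instead of plain text.
        f"  password secret('{secret_scope}', '{secret_key}')\n"
        ")"
    )
    create_catalog = (
        f"CREATE FOREIGN CATALOG {catalog} USING CONNECTION {connection}\n"
        f"OPTIONS (database '{database}')"
    )
    # Query the remote table as if it were a Databricks table and
    # write the result to a managed table.
    copy_table = (
        f"CREATE OR REPLACE TABLE {target_table} AS\n"
        f"SELECT * FROM {catalog}.{source_schema}.{source_table}"
    )
    return [create_connection, create_catalog, copy_table]


if __name__ == "__main__":
    for stmt in federation_statements(
        connection="sqlserver_conn",        # placeholder
        catalog="sqlserver_cat",            # placeholder
        host="your_sql_server_hostname",
        port=1433,
        database="your_database_name",
        user="your_username",
        secret_scope="your_scope",          # placeholder
        secret_key="your_password_key",     # placeholder
        source_schema="dbo",                # placeholder
        source_table="your_table",          # placeholder
        target_table="table_name",
    ):
        print(stmt + ";\n")
```

The connection and foreign catalog only need to be created once; after that, the copy is a single `CREATE OR REPLACE TABLE ... AS SELECT` against the federated catalog.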
07-15-2024 01:19 PM
Thank you @jacovangelder