Databricks JDBC & Remote Write
02-04-2022 12:07 PM
Hello,
I'm trying to write to a Delta Table in my Databricks instance from a remote Spark session on a different cluster with the Simba Spark driver. I can do reads, but when I attempt to do a write, I get the following error:
{
import org.apache.spark.sql.SaveMode

df.write.format("jdbc").mode(SaveMode.Append).options(Map(
  "url" -> "jdbc:spark://adb-<host_id>.azuredatabricks.net:443/default;transportMode=http;ssl=1;httpPath=<http_path>;AuthMech=3;UID=token;PWD=<token>",
  "dbtable" -> "testtable",
  "driver" -> "com.simba.spark.jdbc.Driver"
)).save()
}
java.sql.SQLFeatureNotSupportedException: [Simba][JDBC](10220) Driver does not support this optional feature.
at com.simba.spark.exceptions.ExceptionConverter.toSQLException(Unknown Source)
at com.simba.spark.jdbc.common.SPreparedStatement.checkTypeSupported(Unknown Source)
at com.simba.spark.jdbc.common.SPreparedStatement.setNull(Unknown Source)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.savePartition(JdbcUtils.scala:677)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$saveTable$1(JdbcUtils.scala:856)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$saveTable$1$adapted(JdbcUtils.scala:854)
at org.apache.spark.rdd.RDD.$anonfun$foreachPartition$2(RDD.scala:1020)
......
I'm currently connecting through the cluster in my "Data Science & Engineering" section, using the JDBC/ODBC details listed in its Advanced section. Some guides recommend using a SQL Endpoint for this, and that might be my problem, but I do not have permission to create one at this time. Some posts, for example on Stack Overflow, suggest the cause is the autocommit feature, which the Simba Spark driver reportedly does not support, but I'm unsure, and I couldn't find a Spark or driver option to turn it off.
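Since the stack trace passes through SPreparedStatement.setNull, one thing that can be checked on the remote side is whether the DataFrame being written contains any null values at all. Below is a minimal sketch of such a check, not a fix. The helper name columnsWithNulls is hypothetical, and it assumes only the standard Spark SQL functions (col, count, when):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, count, when}

// Hypothetical diagnostic helper (not part of Spark or the Simba driver):
// returns the names of the columns that hold at least one null value.
def columnsWithNulls(df: DataFrame): Seq[String] = {
  if (df.columns.isEmpty) {
    Seq.empty
  } else {
    // One aggregate per column: count(...) only counts rows where the
    // when(...) expression is non-null, i.e. rows where the column IS null.
    // Backticks keep column names containing dots from being parsed as paths.
    val nullCounts = df
      .select(df.columns.map(c => count(when(col(s"`$c`").isNull, 1)).alias(c)): _*)
      .first()
    df.columns.zipWithIndex.collect {
      case (name, i) if nullCounts.getLong(i) > 0 => name
    }.toSeq
  }
}
```

Calling columnsWithNulls(df) before the write lists the columns for which Spark's JDBC writer would go through setNull. If the list is empty and the write still fails, the nulls are probably not the cause.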
Also, all the documentation for Spark writes seems to cover running in a notebook on the Databricks server, with no remote connection examples using the driver. Is there a documentation page for that which I'm missing?
Remote Spark Instance
-------------------------------
Spark Version: 3.1.1
Scala Version: 2.12.10
Spark Simba JDBC Driver from Databricks: 2.6.22
Databricks Cluster Settings
---------------------
Cloud System: Azure
Policy: Unrestricted
Cluster Mode: Standard
Autoscaling: Enabled
Databricks Runtime Version: 9.1 LTS (includes Apache Spark 3.1.2, Scala 2.12)
Worker & Driver Type: Standard_DS3_v2
Please let me know if you need any other information to help me address my issue.
Thank you,
Kai
Labels: Databricks Instance, Logs, SQL