09-13-2023 08:37 AM
We use the Spark MSSQL connector to connect to SQL Server, and it works well on DBR runtimes 10.*, 11.*, and 12.*. But on DBR 13.* we get the error below. It happens when we try to use df.write to save data to the SQL database.
We encountered a similar error before, when we upgraded from 10.4 to 11.3. After we moved from com.microsoft.azure:spark-mssql-connector_2.12:1.2.0 to the latest MSSQL connector, com.microsoft.azure:spark-mssql-connector_2.12:1.3.0-BETA, it worked.
This time, after checking, there is no newer release of the Spark MSSQL connector; the latest version is still 1.3.0-BETA.
How can we resolve this issue? We don't want to stay on DBR runtime 12.2, because we'd like to use new features such as Unity Catalog. Thanks.
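For context, the failing write looks roughly like this (df is an existing DataFrame; the server, database, table, and credentials below are placeholders, not our real values):

# Minimal sketch of the failing write path, using the Microsoft Spark MSSQL
# connector's data source format. All connection values are placeholders.
server_name = "jdbc:sqlserver://<your-server>.database.windows.net"
database_name = "<your-database>"
url = server_name + ";databaseName=" + database_name

(df.write
    .format("com.microsoft.sqlserver.jdbc.spark")
    .mode("append")
    .option("url", url)
    .option("dbtable", "<schema>.<table>")
    .option("user", "<username>")
    .option("password", "<password>")
    .save())
# On DBR 13.* this fails inside BulkCopyUtils.matchSchemas with the
# NoSuchMethodError below; the same code works on DBR 12.2 and earlier.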
py4j.protocol.Py4JJavaError: An error occurred while calling o811.save.
: java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(Ljava/sql/ResultSet;Lorg/apache/spark/sql/jdbc/JdbcDialect;Z)Lorg/apache/spark/sql/types/StructType;
    at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.matchSchemas(BulkCopyUtils.scala:305)
    at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.getColMetaData(BulkCopyUtils.scala:266)
    at com.microsoft.sqlserver.jdbc.spark.Connector.write(Connector.scala:66)
    at com.microsoft.sqlserver.jdbc.spark.DefaultSource.createRelation(DefaultSource.scala:66)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:49)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.$anonfun$sideEffectResult$1(commands.scala:82)
    at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:80)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:79)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:91)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$3(QueryExecution.scala:272)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:166)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:272)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:274)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:498)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:201)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1113)
    at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:151)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:447)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:271)
    at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:245)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:266)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:251)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:465)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:69)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:465)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:33)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:316)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:312)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:33)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:441)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:251)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:372)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:251)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:203)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:200)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:336)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:956)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:424)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:391)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:258)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
    at py4j.Gateway.invoke(Gateway.java:306)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
    at java.lang.Thread.run(Thread.java:750)
09-14-2023 06:24 AM
@ForestDD I can see a pre-installed MSSQL library available as part of DBR 13.3 LTS.
https://docs.databricks.com/en/release-notes/runtime/13.3lts.html#:~:text=com.microsoft.sqlserver-,m...
Could you try using this and let us know if you are able to overcome the issue?
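For example, DBR 11.3 LTS and above expose the bundled driver through a built-in "sqlserver" data source, so no external connector jar should be needed. A minimal sketch, with df as your existing DataFrame and all connection values as placeholders:

# Sketch: write through the built-in "sqlserver" data source that ships
# with DBR 11.3 LTS+ (no external spark-mssql-connector jar required).
# All connection values below are placeholders.
(df.write
    .format("sqlserver")
    .mode("append")
    .option("host", "<your-server>.database.windows.net")
    .option("port", "1433")  # optional; defaults to 1433
    .option("database", "<your-database>")
    .option("dbtable", "<schema>.<table>")  # schema defaults to dbo if omitted
    .option("user", "<username>")
    .option("password", "<password>")
    .save())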
11-18-2023 04:40 PM
Did this work, @ForestDD?
11-18-2023 04:43 PM
@Tharun-Kumar any idea whether writing to SQL Server works on Spark 3.4.1 with the connector you recommended above?
03-27-2024 07:51 AM
I was also facing the same issue while writing to SQL Server. I was able to resolve it by changing the format from "com.microsoft.sqlserver.jdbc.spark" to "jdbc".
df.write.format("jdbc") works on DBR 13.3 LTS using the connector: com.microsoft.azure:spark-mssql-connector_2.12:1.2.0
Source where the solution is mentioned: https://github.com/microsoft/sql-spark-connector/issues/191
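In case it helps others, the working write looks roughly like this (URL, table, and credentials are placeholders; df is your existing DataFrame):

# Sketch of the workaround: the plain Spark JDBC data source with the
# Microsoft SQL Server driver, which avoids the connector's bulk-copy
# code path that triggers the NoSuchMethodError. Values are placeholders.
jdbc_url = "jdbc:sqlserver://<your-server>.database.windows.net:1433;databaseName=<your-database>"

(df.write
    .format("jdbc")  # was "com.microsoft.sqlserver.jdbc.spark"
    .mode("append")
    .option("url", jdbc_url)
    .option("dbtable", "<schema>.<table>")
    .option("user", "<username>")
    .option("password", "<password>")
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .save())

One caveat: the plain JDBC writer inserts rows over JDBC rather than using the connector's bulk-copy path, so large writes may be noticeably slower.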