java.lang.NoSuchMethodError after upgrade to Databricks Runtime 13

ForestDD
New Contributor

We use the Spark MSSQL connector to connect to SQL Server, and it works well on DBR 10.*, 11.*, and 12.*. But on DBR 13.* we get the error below whenever we call df.write to save data to the SQL database.
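
For context, here is a minimal sketch of the kind of write that triggers the error (server, database, table, and credentials are placeholders, not our real values):

# Minimal sketch of the failing write path via the Spark MSSQL connector (PySpark).
# All connection values below are placeholders.
jdbc_url = "jdbc:sqlserver://<server>.database.windows.net:1433;databaseName=<database>"

(df.write
    .format("com.microsoft.sqlserver.jdbc.spark")   # the connector's data source name
    .mode("append")
    .option("url", jdbc_url)
    .option("dbtable", "<schema>.<table>")
    .option("user", "<user>")
    .option("password", "<password>")
    .save())                                        # raises the NoSuchMethodError below on DBR 13.*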

We encountered a similar error before when we upgraded from 10.4 to 11.3. After we changed from com.microsoft.azure:spark-mssql-connector_2.12:1.2.0 to the latest connector, com.microsoft.azure:spark-mssql-connector_2.12:1.3.0-BETA, it worked.

After checking again, there is no newer release of the Spark MSSQL connector; the latest is still 1.3.0-BETA.

How can we resolve this? We don't want to stay on DBR 12.2 because we'd like to use new features such as Unity Catalog. Thanks.

 

py4j.protocol.Py4JJavaError: An error occurred while calling o811.save.
: java.lang.NoSuchMethodError: org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(Ljava/sql/ResultSet;Lorg/apache/spark/sql/jdbc/JdbcDialect;Z)Lorg/apache/spark/sql/types/StructType;
    at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.matchSchemas(BulkCopyUtils.scala:305)
    at com.microsoft.sqlserver.jdbc.spark.BulkCopyUtils$.getColMetaData(BulkCopyUtils.scala:266)
    at com.microsoft.sqlserver.jdbc.spark.Connector.write(Connector.scala:66)
    at com.microsoft.sqlserver.jdbc.spark.DefaultSource.createRelation(DefaultSource.scala:66)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:49)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.$anonfun$sideEffectResult$1(commands.scala:82)
    at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:80)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:79)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:91)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$3(QueryExecution.scala:272)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:166)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:272)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:274)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:498)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:201)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1113)
    at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:151)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:447)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:271)
    at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:245)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:266)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:251)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:465)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:69)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:465)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:33)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:316)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:312)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:33)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:33)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:441)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:251)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:372)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:251)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:203)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:200)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:336)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:956)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:424)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:391)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:258)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
    at py4j.Gateway.invoke(Gateway.java:306)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:195)
    at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
    at java.lang.Thread.run(Thread.java:750)


Kaniz
Community Manager

Hi @ForestDD, based on the provided information, you seem to be facing an issue with Databricks Runtime (DBR) 13.* and the MSSQL connector.

Since you've mentioned there are no new updates for the MSSQL connector, and it's still at version 1.3.0-BETA, which worked with DBR 10.*, 11.*, and 12.*, the issue is likely a compatibility problem between DBR 13.* and the connector version you are using.

Databricks Connect is recommended for Databricks Runtime 13.0 and higher ([source](https://docs.databricks.com/dev-tools/databricks-connect-legacy.html)).

It allows you to write jobs using Spark APIs and run them remotely on a cluster instead of in the local Spark session. It is built on open source Spark Connect, which introduces a decoupled client-server architecture that allows remote connectivity to Spark clusters using the DataFrame API and unresolved logical plans as the protocol.

However, Databricks Connect has certain limitations: it doesn't support running arbitrary code that is not part of a Spark job on the remote cluster, and it doesn't support the native Scala, Python, and R APIs for Delta table operations ([source](https://docs.databricks.com/dev-tools/databricks-connect-legacy.html)).
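
For illustration, a minimal sketch of opening a remote session with Databricks Connect for DBR 13.0 and higher (this assumes the databricks-connect package is installed locally and a default Databricks configuration profile with host, token, and cluster ID is already set up; the table name is only an example):

# Minimal sketch: run DataFrame code remotely via Databricks Connect (DBR 13+).
# Assumes `pip install databricks-connect` and a configured default profile.
from databricks.connect import DatabricksSession

spark = DatabricksSession.builder.getOrCreate()   # connects to the configured cluster

df = spark.read.table("samples.nyctaxi.trips")    # example table
df.show(5)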

To resolve the issue:

1. Consider using Databricks Connect for Databricks Runtime 13.0 and higher, weighing its limitations against your requirements.
2. If those limitations do affect your requirements, you may need to wait for a release of the MSSQL connector that is compatible with DBR 13.*, or contact Databricks support for further assistance.

Always test in a controlled environment before applying any changes to production.

Tharun-Kumar
Honored Contributor II

@ForestDD I can see a pre-installed MSSQL library available as part of DBR 13.3 LTS.
https://docs.databricks.com/en/release-notes/runtime/13.3lts.html#:~:text=com.microsoft.sqlserver-,m...

Could you try using this and let us know if you are able to overcome the issue?

jimbo
New Contributor II

Did this work, @ForestDD?

jimbo
New Contributor II

@Tharun-Kumar any ideas whether the library you recommended above is compatible with writing to SQL Server on Spark 3.4.1?

AradhanaSahu
New Contributor II

I was also facing the same issue while writing to SQL Server. I was able to resolve it by changing the format to "jdbc" instead of "com.microsoft.sqlserver.jdbc.spark".

df.write.format("jdbc") works on DBR 13.3 LTS using the connector: com.microsoft.azure:spark-mssql-connector_2.12:1.2.0

Source where the solution is mentioned: https://github.com/microsoft/sql-spark-connector/issues/191
