Here you go: https://gist.github.com/daschl/c2528af17af727d0688f4366d2177498. I ran a local master and worker and then submitted the same app to it with spark-submit. Note that our connector is provided via --jars in this case:

(./spark-submit --jars ~/code/couchbase-spark-connector/target/scala-2.12/spark-connector-assembly-3.2.0-SNAPSHOT.jar --conf "spark.executor.extraJavaOptions=-verbose:class" --master spark://machine.local:7077 ~/code/scala/spark3-examples/target/scala-2.12/spark3-examples_2.12-1.0.0-SNAPSHOT.jar)

From the logs I can see that locally this is loaded:

[7.160s][info][class,load] org.apache.spark.sql.catalyst.json.CreateJacksonParser$ source: file:/Users/myuser/Downloads/spark-3.2.0-bin-hadoop3.2/jars/spark-catalyst_2.12-3.2.0.jar

But it seems to be missing in the Databricks environment.
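
In case it helps with comparing the two environments, here is a minimal sketch for pulling the source jar of a given class out of a `-verbose:class` log. It assumes the unified JVM logging format shown in the line above (`[class,load] <class> source: <location>`); older JVMs print a different format (`[Loaded ... from ...]`) that this regex does not handle. The function name `find_class_sources` and the `sample` list are just for illustration.

```python
import re

# Matches unified-logging class-load lines, e.g.
# [7.160s][info][class,load] some.Class$ source: file:/path/to/some.jar
CLASS_LOAD = re.compile(
    r"\[class,load\]\s+(?P<cls>\S+)\s+source:\s+(?P<src>.+?)\s*$"
)


def find_class_sources(lines, class_name):
    """Return every source location from which class_name was loaded."""
    sources = []
    for line in lines:
        match = CLASS_LOAD.search(line)
        if match and match.group("cls") == class_name:
            sources.append(match.group("src"))
    return sources


# Sample input: the log line quoted above from the local run.
sample = [
    "[7.160s][info][class,load] "
    "org.apache.spark.sql.catalyst.json.CreateJacksonParser$ "
    "source: file:/Users/myuser/Downloads/spark-3.2.0-bin-hadoop3.2/"
    "jars/spark-catalyst_2.12-3.2.0.jar"
]

print(find_class_sources(
    sample, "org.apache.spark.sql.catalyst.json.CreateJacksonParser$"
))
```

An empty list for the same class name on an executor log from the notebook environment would be consistent with the class never being loaded there, although it would not by itself explain why.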

AV
New Contributor III

Hello @Xin Wang, thank you so much for your response earlier. Totally understood. I think @Michael Nitschinger has provided the information that's needed. Is that sufficient for your debugging? Please let us know; this is a blocker for us and our customers.

daschl
Contributor

Since there hasn't been any progress on this for over a month, I applied a workaround and copied the classes into the connector source code so we don't have to rely on the Databricks classloader. It seems to work in my testing and will be released with the next minor version (connector 3.2.0). Nonetheless, I still think this is an issue in the Databricks notebook environment and should be addressed on your side?
