11-03-2021 10:31 AM
Using a Databricks spark-submit job with a new cluster:
1] "spark_version": "8.2.x-scala2.12" => OK, works fine
2] "spark_version": "9.1.x-scala2.12" => FAIL, with the errors below:
Exception in thread "main" java.lang.ExceptionInInitializerError
at com.databricks.backend.daemon.driver.WSFSCredentialForwardingHelper.withWSFSCredentials(WorkspaceLocalFileSystem.scala:156)
at com.databricks.backend.daemon.driver.WSFSCredentialForwardingHelper.withWSFSCredentials$(WorkspaceLocalFileSystem.scala:155)
at com.databricks.backend.daemon.driver.WorkspaceLocalFileSystem.withWSFSCredentials(WorkspaceLocalFileSystem.scala:30)
at com.databricks.backend.daemon.driver.WorkspaceLocalFileSystem.getFileStatus(WorkspaceLocalFileSystem.scala:63)
at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
at org.apache.hadoop.fs.Globber.glob(Globber.java:252)
at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1657)
at org.apache.spark.deploy.DependencyUtils$.resolveGlobPath(DependencyUtils.scala:192)
at org.apache.spark.deploy.DependencyUtils$.$anonfun$resolveGlobPaths$2(DependencyUtils.scala:147)
at org.apache.spark.deploy.DependencyUtils$.$anonfun$resolveGlobPaths$2$adapted(DependencyUtils.scala:145)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
at scala.collection.TraversableLike.flatMap(TraversableLike.scala:245)
at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:242)
at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
at org.apache.spark.deploy.DependencyUtils$.resolveGlobPaths(DependencyUtils.scala:145)
at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$4(SparkSubmit.scala:363)
at scala.Option.map(Option.scala:230)
at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:363)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.NullPointerException
at com.databricks.backend.daemon.driver.WsfsDriverHttpClient.<init>(WSFSDriverHttpClient.scala:26)
at com.databricks.backend.daemon.driver.WSFSCredentialForwardingHelper$.<init>(WorkspaceLocalFileSystem.scala:277)
at com.databricks.backend.daemon.driver.WSFSCredentialForwardingHelper$.<clinit>(WorkspaceLocalFileSystem.scala)
... 28 more
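For context, the kind of spark-submit job described above can be created as a one-time run through the Databricks Jobs API. The sketch below is only an illustration, not the original job: the workspace URL, token, node type, worker count, and application path are placeholder assumptions, and the --packages coordinate is the one mentioned in the replies further down.
------
import requests

# Hypothetical one-time run reproducing the setup above (Jobs API 2.0 runs/submit).
payload = {
    "run_name": "spark-submit-example",              # placeholder name
    "new_cluster": {
        "spark_version": "9.1.x-scala2.12",          # 8.2.x-scala2.12 works, 9.1.x-scala2.12 fails
        "node_type_id": "i3.xlarge",                 # assumption: any valid node type
        "num_workers": 2,                            # assumption
    },
    "spark_submit_task": {
        "parameters": [
            "--packages",
            "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1",
            "dbfs:/path/to/main.py",                 # hypothetical application path
        ]
    },
}

resp = requests.post(
    "https://<databricks-instance>/api/2.0/jobs/runs/submit",  # placeholder workspace URL
    headers={"Authorization": "Bearer <personal-access-token>"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())
------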
Labels: Databricks spark, Spark, Spark Version
Accepted Solutions
11-10-2021 02:14 PM
This has been resolved by adding the following spark_conf to the cluster definition (not through --conf):
"spark.hadoop.fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem"
Example:
------
"new_cluster": {
    "spark_version": "9.1.x-scala2.12",
    ...
    "spark_conf": {
        "spark.hadoop.fs.file.impl": "org.apache.hadoop.fs.LocalFileSystem"
    }
},
"spark_submit_task": {
    "parameters": [
        "--packages",
        "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1",
        ...
------
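Judging from the stack trace in the question, spark-submit fails while globbing its dependency paths through Databricks' WorkspaceLocalFileSystem (the WSFS credential-forwarding helper), which hits a NullPointerException on 9.1. Setting spark.hadoop.fs.file.impl to org.apache.hadoop.fs.LocalFileSystem presumably routes those file: paths through the stock Hadoop local filesystem instead, bypassing the failing code path.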

11-03-2021 01:04 PM
@Raymund Beltran - So everything works as expected now? Is that right? If yes, would you be happy to mark your answer as best so others can find it easily?
11-08-2021 07:37 AM
@Piper Wilson I removed the comment that it was working. The issue still exists with 9.1. The Kafka streaming jars are not present on the cluster, so the job throws an error when they are not provided and the PySpark code uses Kafka streaming. When the Kafka streaming jars are explicitly provided via --packages, it throws the same error as the original issue above.
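For reference, "the PySpark code uses Kafka streaming" means something along the lines of the sketch below; the broker address and topic are hypothetical placeholders, and the format("kafka") source is what requires the spark-sql-kafka-0-10 package from --packages to be on the classpath.
------
from pyspark.sql import SparkSession

# Minimal PySpark structured-streaming read from Kafka; broker and topic are placeholders.
spark = SparkSession.builder.appName("kafka-streaming-example").getOrCreate()

df = (
    spark.readStream
    .format("kafka")                                   # needs spark-sql-kafka-0-10 on the classpath
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "my-topic")                   # hypothetical topic
    .load()
)

query = (
    df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
    .writeStream
    .format("console")
    .outputMode("append")
    .start()
)
query.awaitTermination()
------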

11-08-2021 08:18 AM
@Raymund Beltran - Thanks for letting us know. Let's see what the community has to say about this. We'll circle back if we need to.
11-09-2021 12:39 PM
Additional info: using the Databricks spark-submit API with PySpark:
"spark_submit_task": {
    "parameters": [
        "--packages",
        "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1",
        ....

11-09-2021 01:12 PM
Thank you. I'm passing the information on. Thanks for your patience!
11-18-2021 10:01 AM
Thank you for sharing the solution to this issue. I think I saw another question with the same error message.

