Databricks Community

Sachin_ · ‎03-07-2024

I am trying to execute a Scala JAR from a notebook. When I execute it explicitly, the JAR runs fine, like this:

But when I run the notebook through a Databricks workflow, I get the error below: The spark context has stopped and the driver is restarting. Your notebook will be automatically reattached.

Steps I have taken so far:

Tried increasing the Spark driver memory like this:

Sachin_ · ‎03-17-2024

Hello @Kani! Thanks for your time. Our organisation's standard practice is to trigger the JAR using py4j. I have now managed to get the logs, which I was not getting earlier. The error is below:

'{"error_code": 1, "error_message": "py4j does not exist in the JVM -- org.apache.spark.SparkException: Trying to putInheritedProperty with no active spark context\n\tat org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$2(CredentialContext.scala:188)\n\tat scala.Option.getOrElse(Option.scala:189)\n\tat org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$1(CredentialContext.scala:188)\n\tat scala.Option.getOrElse(Option.scala:189)\n\tat org.apache.spark.credentials.CredentialContext$.putInheritedProperty(CredentialContext.scala:187)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2(SparkThreadLocalUtils.scala:56)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2$adapted(SparkThreadLocalUtils.scala:56)\n\tat scala.Option.foreach(Option.scala:407)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.run(SparkThreadLocalUtils.scala:56)\n\tat java.lang.Iterable.forEach(Iterable.java:75)\n\tat py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:194)\n\tat py4j.ClientServerConnection.run(ClientServerConnection.java:115)\n\tat java.lang.Thread.run(Thread.java:750)\n"}', -- NEW Internal data field
'1', -- NEW error code field
NULL -- NEW Write ID field
-- Forward compatibility padding
)
2024-03-15 10:29:49,643:offer_toolbox.retry:DEBUG:Last attempt for 'robust_execute'
2024-03-15 10:29:49,644:offer_metadata.v1.store:DEBUG:Insert event from Step v1
2024-03-15 10:29:49,645:offer_metadata.sql.engine:DEBUG:End cursor self._connection_count=1 CALLED:keep_alive=False FIXED:self.keep_alive=True
2024-03-15 10:29:49,646:offer_metadata.v1.operational:INFO:[AAM_Demo][demo_action] END Step
2024-03-15 10:29:50,711:offer_companion.companion:WARNING:Intercept IPython error py4j does not exist in the JVM -- org.apache.spark.SparkException: Trying to putInheritedProperty with no active spark context
at org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$2(CredentialContext.scala:188)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$1(CredentialContext.scala:188)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.credentials.CredentialContext$.putInheritedProperty(CredentialContext.scala:187)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2(SparkThreadLocalUtils.scala:56)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2$adapted(SparkThreadLocalUtils.scala:56)
at scala.Option.foreach(Option.scala:407)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.run(SparkThreadLocalUtils.scala:56)
at java.lang.Iterable.forEach(Iterable.java:75)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:194)
at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
at java.lang.Thread.run(Thread.java:750)
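
The JSON error payload in the log above is hard to read as a single line. Below is a minimal sketch for splitting such a payload into its headline and JVM stack frames. It assumes the payload is valid JSON with the `error_code` and `error_message` keys seen above. The helper name `summarize_error` is hypothetical, and `sample` is a shortened copy of the logged payload with only two frames kept.

```python
import json


def summarize_error(payload: str) -> dict:
    """Split a JSON error payload into its headline and JVM stack frames."""
    record = json.loads(payload)
    lines = record["error_message"].split("\n")
    # The first line carries the exception type and message.
    headline = lines[0]
    # The remaining non-empty lines are stack frames of the form "\tat pkg.Class.method(File:line)".
    frames = [ln.strip() for ln in lines[1:] if ln.strip().startswith("at ")]
    return {
        "error_code": record["error_code"],
        "headline": headline,
        "frames": frames,
    }


# Shortened copy of the payload from the log (assumption: only two frames kept).
sample = (
    '{"error_code": 1, "error_message": "py4j does not exist in the JVM -- '
    "org.apache.spark.SparkException: Trying to putInheritedProperty "
    "with no active spark context"
    "\\n\\tat org.apache.spark.credentials.CredentialContext$"
    ".putInheritedProperty(CredentialContext.scala:187)"
    '\\n\\tat java.lang.Thread.run(Thread.java:750)\\n"}'
)

summary = summarize_error(sample)
print(summary["headline"])
for frame in summary["frames"]:
    print("  ", frame)
```

Read this way, the headline says the py4j thread ran while no Spark context was active, which seems consistent with the "spark context has stopped" message from the workflow run.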

Any idea on how I can initialize the cluster with py4j?

jose_gonzalez · ‎03-20-2024

Could you share the code in your JAR file? How are you creating the Spark context in it?

Sachin_ · ‎04-04-2024

Hello @jose_gonzalez and @Kani! Apologies for the late reply. This is how we initialize the Spark session in our ETL:
