cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

The spark context has stopped and the driver is restarting. Your notebook will be automatically

Sachin_
New Contributor II

I am trying to execute a scala jar in notebook. When I execute it explicitly I am able to run the jar like this :

Sachin__1-1709881658170.pngSachin__2-1709881677411.png

but when I am trying to run a notebook through databricks workflow I get the below errorThe spark context has stopped and the driver is restarting. Your notebook will be automatically reattached.

Steps I have taken till now : 

tried increasing spark driver memory like this : 

Sachin__3-1709881874830.png

 

 

4 REPLIES 4

Kaniz
Community Manager
Community Manager

Hi @Sachin_, To run a Scala JAR within a Databricks notebook, consider the following approaches:

  1. Additional Considerations:

    • Ensure that the JAR file is accessible from the Databricks environment.
    • Check for any specific environment variables or configurations required by your Scala application.
    • Review Databricks documentation for best practices on running external JARs.

Remember that Databricks provides a powerful platform for big data processing, but sometimes specific adjustments are needed to ensure smooth execution. Feel free to provide more details or ask for further assistance if needed! 🚀

 

Sachin_
New Contributor II

Hello @Kani ! Thanks for your time. We have certain practices followed in our organisation to trigger the Jar which is by using py4j. So, I tried getting the logs which I was not getting earlier. The error is as below : 

'{"error_code": 1, "error_message": "py4j does not exist in the JVM -- org.apache.spark.SparkException: Trying to putInheritedProperty with no active spark context\n\tat org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$2(CredentialContext.scala:188)\n\tat scala.Option.getOrElse(Option.scala:189)\n\tat org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$1(CredentialContext.scala:188)\n\tat scala.Option.getOrElse(Option.scala:189)\n\tat org.apache.spark.credentials.CredentialContext$.putInheritedProperty(CredentialContext.scala:187)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2(SparkThreadLocalUtils.scala:56)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2$adapted(SparkThreadLocalUtils.scala:56)\n\tat scala.Option.foreach(Option.scala:407)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.run(SparkThreadLocalUtils.scala:56)\n\tat java.lang.Iterable.forEach(Iterable.java:75)\n\tat py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:194)\n\tat py4j.ClientServerConnection.run(ClientServerConnection.java:115)\n\tat java.lang.Thread.run(Thread.java:750)\n"}', -- NEW Internal data field
'1', -- NEW error code field
NULL -- NEW Write ID field
-- Forward compatibility padding
)
2024-03-15 10:29:49,643:offer_toolbox.retry:DEBUG:Last attempt for 'robust_execute'
2024-03-15 10:29:49,644:offer_metadata.v1.store:DEBUG:Insert event from Step v1
2024-03-15 10:29:49,645:offer_metadata.sql.engine:DEBUG:End cursor self._connection_count=1 CALLED:keep_alive=False FIXED:self.keep_alive=True
2024-03-15 10:29:49,646:offer_metadata.v1.operational:INFO:[AAM_Demo][demo_action] END Step
2024-03-15 10:29:50,711:offer_companion.companion:WARNING:Intercept IPython error py4j does not exist in the JVM -- org.apache.spark.SparkException: Trying to putInheritedProperty with no active spark context
at org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$2(CredentialContext.scala:188)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$1(CredentialContext.scala:188)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.credentials.CredentialContext$.putInheritedProperty(CredentialContext.scala:187)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2(SparkThreadLocalUtils.scala:56)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2$adapted(SparkThreadLocalUtils.scala:56)
at scala.Option.foreach(Option.scala:407)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.run(SparkThreadLocalUtils.scala:56)
at java.lang.Iterable.forEach(Iterable.java:75)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:194)
at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
at java.lang.Thread.run(Thread.java:750)

Any idea on how I can initialize the cluster with py4j?

jose_gonzalez
Moderator
Moderator

Could you share the code you have in your JAR file? how are you creating your Spark context in your JAR file?

 

Hello @jose_gonzalez and @Kani ! Apologies foe the late reply. This is how we initialize spark session in ETL

Sachin__0-1712220666036.png