The spark context has stopped and the driver is restarting. Your notebook will be automatically reattached.

Sachin_
New Contributor II

I am trying to execute a Scala JAR in a notebook. When I execute it explicitly, I am able to run the JAR like this:

(screenshots: Sachin__1-1709881658170.png, Sachin__2-1709881677411.png)

But when I run the notebook through a Databricks workflow, I get the error below: "The spark context has stopped and the driver is restarting. Your notebook will be automatically reattached."

Steps I have taken so far:

tried increasing the Spark driver memory like this:

(screenshot: Sachin__3-1709881874830.png)
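(Roughly the kind of cluster Spark config I set; the values here are illustrative, not the exact ones from the screenshot:)

spark.driver.memory 16g
spark.driver.maxResultSize 8g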


Kaniz
Community Manager

Hi @Sachin_, to run a Scala JAR within a Databricks notebook, keep the following in mind:

    • Ensure that the JAR file is accessible from the Databricks environment.
    • Check for any specific environment variables or configurations required by your Scala application.
    • Review Databricks documentation for best practices on running external JARs (see the sketch below).
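As a minimal, illustrative sketch, invoking a JAR's entry point from a Python notebook via py4j could look like the following, assuming the JAR is already installed as a cluster library; com.example.etl.Main is a placeholder for your own class:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
sc = spark.sparkContext

# main(String[]) expects a real Java array, so build one through the py4j
# gateway instead of passing a Python list.
args = sc._gateway.new_array(sc._jvm.java.lang.String, 0)

# Invoke the entry point on the driver JVM (placeholder class name).
sc._jvm.com.example.etl.Main.main(args)

Because the call goes through sc._jvm, the class runs inside the notebook's existing driver JVM and shares its Spark context rather than creating a new one.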

Remember that Databricks provides a powerful platform for big data processing, but sometimes specific adjustments are needed to ensure smooth execution. Feel free to provide more details or ask for further assistance if needed! 🚀


Sachin_
New Contributor II

Hello @Kaniz! Thanks for your time. In our organisation, the standard practice is to trigger the JAR via py4j. I also managed to capture the logs I was not getting earlier. The error is below:

'{"error_code": 1, "error_message": "py4j does not exist in the JVM -- org.apache.spark.SparkException: Trying to putInheritedProperty with no active spark context\n\tat org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$2(CredentialContext.scala:188)\n\tat scala.Option.getOrElse(Option.scala:189)\n\tat org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$1(CredentialContext.scala:188)\n\tat scala.Option.getOrElse(Option.scala:189)\n\tat org.apache.spark.credentials.CredentialContext$.putInheritedProperty(CredentialContext.scala:187)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2(SparkThreadLocalUtils.scala:56)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2$adapted(SparkThreadLocalUtils.scala:56)\n\tat scala.Option.foreach(Option.scala:407)\n\tat com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.run(SparkThreadLocalUtils.scala:56)\n\tat java.lang.Iterable.forEach(Iterable.java:75)\n\tat py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:194)\n\tat py4j.ClientServerConnection.run(ClientServerConnection.java:115)\n\tat java.lang.Thread.run(Thread.java:750)\n"}', -- NEW Internal data field
'1', -- NEW error code field
NULL -- NEW Write ID field
-- Forward compatibility padding
)
2024-03-15 10:29:49,643:offer_toolbox.retry:DEBUG:Last attempt for 'robust_execute'
2024-03-15 10:29:49,644:offer_metadata.v1.store:DEBUG:Insert event from Step v1
2024-03-15 10:29:49,645:offer_metadata.sql.engine:DEBUG:End cursor self._connection_count=1 CALLED:keep_alive=False FIXED:self.keep_alive=True
2024-03-15 10:29:49,646:offer_metadata.v1.operational:INFO:[AAM_Demo][demo_action] END Step
2024-03-15 10:29:50,711:offer_companion.companion:WARNING:Intercept IPython error py4j does not exist in the JVM -- org.apache.spark.SparkException: Trying to putInheritedProperty with no active spark context
at org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$2(CredentialContext.scala:188)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.credentials.CredentialContext$.$anonfun$putInheritedProperty$1(CredentialContext.scala:188)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.credentials.CredentialContext$.putInheritedProperty(CredentialContext.scala:187)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2(SparkThreadLocalUtils.scala:56)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.$anonfun$run$2$adapted(SparkThreadLocalUtils.scala:56)
at scala.Option.foreach(Option.scala:407)
at com.databricks.backend.daemon.driver.SparkThreadLocalUtils$$anon$1.run(SparkThreadLocalUtils.scala:56)
at java.lang.Iterable.forEach(Iterable.java:75)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:194)
at py4j.ClientServerConnection.run(ClientServerConnection.java:115)
at java.lang.Thread.run(Thread.java:750)
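For context, a simplified sanity check we can run right before the py4j trigger, to see whether the notebook still has a live context when the workflow runs it (illustrative, not our exact code):

from pyspark.sql import SparkSession

# If this comes back None under the workflow run, it would line up with
# the "no active spark context" error in the trace above.
spark = SparkSession.getActiveSession()
print("active session:", spark)
if spark is not None:
    print("applicationId:", spark.sparkContext.applicationId)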

Any idea on how I can initialize the cluster with py4j?

jose_gonzalez
Moderator

Could you share the code you have in your JAR file? How are you creating your Spark context in your JAR file?


Sachin_
New Contributor II

Hello @jose_gonzalez and @Kaniz! Apologies for the late reply. This is how we initialize the Spark session in our ETL:

(screenshot: Sachin__0-1712220666036.png)

 
