Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
'NotebookHandler' object has no attribute 'setContext' in pyspark streaming in AWS

Sreekanth_N
New Contributor II

I am facing an issue while calling dbutils.notebook.run() inside PySpark Structured Streaming with a concurrent.futures executor. The first error is: "pyspark.sql.utils.IllegalArgumentException: Context not valid. If you are calling this outside the main thread, you must set the Notebook context via dbutils.notebook.setContext(ctx), where ctx is a value retrieved from the main thread (and the same cell) via dbutils.notebook.getContext()."

But I can't find setContext or getContext anywhere on dbutils.notebook in the Python API.

2 REPLIES

Kevin3
New Contributor III

The error message you're encountering suggests that the context in which you are calling dbutils.notebook.run() is not valid. Databricks notebooks have specific requirements when dbutils.notebook.run() is used from worker threads rather than the main thread.

Here are some steps you can follow to address this issue:

Set Notebook Context:

  • As indicated in the error message, you should set the notebook context using dbutils.notebook.setContext(ctx) before calling dbutils.notebook.run(). The ctx value should be retrieved from the main thread in the same cell.

# Retrieve the context in the main thread
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()

# Set the context
dbutils.notebook.setContext(ctx)

# Now you can call dbutils.notebook.run()
dbutils.notebook.run(...)

The key is to set the notebook context correctly in each worker thread and to manage the concurrent execution so that it does not interfere with the notebook's context.
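Since dbutils only exists on a Databricks cluster, here is a runnable local analogue of the same pattern — capture the context once on the main thread, then set it inside each worker before doing the work — using a threading.local as a stand-in for the notebook context (the names here are illustrative, not Databricks APIs):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# A thread-local slot standing in for the per-thread notebook context.
_local = threading.local()

def set_context(ctx):
    _local.ctx = ctx

def get_context():
    # Raises AttributeError (the analogue of "Context not valid")
    # if the current thread never set a context.
    return _local.ctx

def run_child(name, ctx):
    # Worker thread: the context captured on the main thread must be
    # set here, because main-thread thread-locals are not visible.
    set_context(ctx)
    return f"{name} ran with {get_context()}"

# Main thread: capture the context once, in the same cell/scope.
main_ctx = "ctx-from-main-thread"

with ThreadPoolExecutor(max_workers=2) as pool:
    results = [pool.submit(run_child, f"child{i}", main_ctx).result()
               for i in range(2)]

print(results)
```

If a worker calls get_context() without setting the context first, it fails — which is exactly the shape of the streaming error above.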

Arai
New Contributor III

Hello Kevin3,

In pyspark, the line below gets the context:
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()

But calling "dbutils.notebook.setContext(ctx)" fails — the 'NotebookHandler' object has no 'setContext' attribute, so it doesn't set the context.

Searching through the dbutils internals, I found a method that looks like this:

dbutils.notebook.entry_point.getDbutils().notebook().setContext(ctx)

But even after using this I am unable to start the child notebook from the parent notebook. It gives the error below:

"Exception due to : Context not valid. If you are calling this outside the main thread, you must set the Notebook context via dbutils.notebook.setContext(ctx), where ctx is a value retrieved from the main thread (and the same cell) via dbutils.notebook.getContext()."

Maybe I am not using it the correct way. Let me know if you have a workaround for this situation:

1. Attaching the functions that I am using to run the child notebook.

2. I am running the function "execute_child_nb" in the stream microbatch. (Screenshot attached.)
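For reference, the shape I'm attempting is roughly the following. The stub class below only stands in for the dbutils object so the control flow can be shown off-cluster — on a real cluster the getContext()/setContext() calls go through dbutils.notebook.entry_point.getDbutils().notebook(), and the stub does not reproduce the cluster behavior that is failing for me:

```python
from concurrent.futures import ThreadPoolExecutor

class _NotebookStub:
    """Minimal stand-in for dbutils.notebook.entry_point...notebook()."""
    def getContext(self):
        return "ctx"
    def setContext(self, ctx):
        self._ctx = ctx
    def run(self, path, timeout_seconds=0, arguments=None):
        if getattr(self, "_ctx", None) is None:
            raise RuntimeError("Context not valid")
        return f"ran {path}"

notebook = _NotebookStub()

# Main thread (same cell): capture the context once.
ctx = notebook.getContext()

def execute_child_nb(path):
    # Worker thread: set the context *before* calling run().
    notebook.setContext(ctx)
    return notebook.run(path, timeout_seconds=600, arguments={})

# Called from the stream microbatch, fanning out over child notebooks.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(execute_child_nb, ["/child_a", "/child_b"]))

print(results)
```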

Thanks,

Abhijit
