'NotebookHandler' object has no attribute 'setContext' in pyspark streaming in AWS

Sreekanth_N
New Contributor II

I am facing an issue while calling dbutils.notebook.run() inside PySpark streaming with a concurrent.futures executor. The error is: "pyspark.sql.utils.IllegalArgumentException: Context not valid. If you are calling this outside the main thread, you must set the Notebook context via dbutils.notebook.setContext(ctx), where ctx is a value retrieved from the main thread (and the same cell) via dbutils.notebook.getContext()."

But I could not find setContext or getContext in the Python code.

3 REPLIES

Kaniz
Community Manager

Hi @Sreekanth_N, the error message "pyspark.sql.utils.IllegalArgumentException: Context not valid. If you are calling this outside the main thread, you must set the Notebook context via dbutils.notebook.setContext(ctx), where ctx is a value retrieved from the main thread (and the same cell) via dbutils.notebook.getContext()" indicates that dbutils.notebook.run() is being called outside the main thread without the notebook context being set.

To resolve this, you need to set the notebook context with dbutils.notebook.setContext(ctx), where ctx is a value retrieved on the main thread via dbutils.notebook.getContext().

However, as you note, setContext and getContext are not exposed directly in the Python dbutils.notebook API. It's possible that these functions are specific to a different language or environment.

Kevin3
New Contributor III

The error message you're encountering in PySpark when using dbutils.notebook.run() suggests that the context in which you are attempting to call the run() method is not valid. PySpark notebooks in Databricks have certain requirements when it comes to using dbutils.notebook.run() in a multi-threaded environment.

Here are some steps you can follow to address this issue:

Set Notebook Context:

  • As indicated in the error message, you should set the notebook context using dbutils.notebook.setContext(ctx) before calling dbutils.notebook.run(). The ctx value should be retrieved from the main thread in the same cell.

# Retrieve the context in the main thread
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()

# Set the context
dbutils.notebook.setContext(ctx)

# Now you can call dbutils.notebook.run()
dbutils.notebook.run(...)

The key is to set the notebook context correctly and ensure that any concurrent execution is managed in a way that does not interfere with the notebook's context and execution.
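The pattern described above — capture a context value once on the main thread, then re-establish it inside each worker thread before doing any work — can be sketched generically with concurrent.futures. This is a minimal illustration of the threading pattern only; the set_context/get_context/run_child names here are stand-ins, not Databricks APIs:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a context object that is only valid per-thread,
# similar to the notebook context that dbutils expects.
_local = threading.local()

def set_context(ctx):
    _local.ctx = ctx

def get_context():
    return getattr(_local, "ctx", None)

def run_child(name):
    # Fails if the calling thread never had the context set,
    # mirroring the "Context not valid" error.
    if get_context() is None:
        raise RuntimeError("Context not valid")
    return f"ran {name} with {get_context()}"

# Capture the context once on the main thread...
set_context("main-ctx")
main_ctx = get_context()

def worker(name, ctx=main_ctx):
    set_context(ctx)  # ...and re-set it inside each worker thread
    return run_child(name)

with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(worker, ["nb_a", "nb_b"]))

print(results)
```

If the `set_context(ctx)` line inside `worker` is removed, `run_child` raises, which is the same failure mode as calling dbutils.notebook.run() from a pool thread that never received the main thread's context.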

Arai
New Contributor III

Hello Kevin3,

In PySpark, the line below gets the context:
ctx = dbutils.notebook.entry_point.getDbutils().notebook().getContext()

But "dbutils.notebook.setContext(ctx)" doesn't set the context.

Searching through the dbutils object, I found a method that looks like this:

dbutils.notebook.entry_point.getDbutils().notebook().setContext(ctx)

But even after using this, I am unable to start the child notebook from the parent notebook. It gives the error below:

"Exception due to : Context not valid. If you are calling this outside the main thread, you must set the Notebook context via dbutils.notebook.setContext(ctx), where ctx is a value retrieved from the main thread (and the same cell) via dbutils.notebook.getContext()."

Maybe I am not using it the correct way. Let me know if you have a workaround for this situation:

1. Attaching the functions that I am using to run the child notebook.

2. I am running the function "execute_child_nb" in the stream micro-batch (screenshot attached).

Thanks,

Abhijit