Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Why is my calling notebook not receiving the value of a variable defined in the called notebook?

Saf4Databricks
Contributor

Remarks: I thought you can use the %run command to make variables defined in one notebook available in another. The %run command executes the specified notebook inline within the current notebook's session, so all functions, variables, and DataFrames defined in the called notebook become accessible in the calling notebook.

PLEASE NOTE: The question is NOT about achieving the same result using alternatives such as dbutils.notebook.exit(). Rather, it is about why this code is not working.

Notebook_A:

my_variable = "Hello from Notebook A"

Notebook_B:

cell_1

%run "./Notebook_A"

cell_2

# Now you can use the variable defined in Notebook_A
print(my_variable)

Output of cell_2:

my_variable

Expected output of cell_2:

Hello from Notebook A

 

1 ACCEPTED SOLUTION


Hi @Saf4Databricks,

I can see why.

In your original post, you executed the following:

print(my_variable)

But in your screenshot, you are executing the statement below. Because my_variable is wrapped in double quotes, Python treats it as a string literal and prints the text my_variable rather than the variable's value.

print("my_variable")

Screenshot with evidence:

Variables_2.jpg
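For readers without access to the screenshot, here is a minimal plain-Python sketch of the difference. It does not need Databricks or %run. The names unquoted and quoted are illustrative only and are not part of the original notebooks.

```python
# Same assignment as in Notebook_A.
my_variable = "Hello from Notebook A"

# Without quotes, my_variable is a name: Python looks it up and uses its value.
unquoted = my_variable

# With quotes, "my_variable" is a string literal: just the characters between the quotes.
quoted = "my_variable"

print(unquoted)  # Hello from Notebook A
print(quoted)    # my_variable
```

The second print matches the output reported in the question, which is why the problem looked like a missing variable even though %run had worked.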

 If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***


6 REPLIES

Ashwin_DSA
Databricks Employee

Hi @Saf4Databricks,

By default, variables are local to a notebook’s session and are not visible elsewhere; any sharing of variables across notebooks must be explicit. The main reason is session isolation: each notebook has its own interpreter process (REPL), that is, its own runtime.

Here is a reference you may find useful. While it doesn't state this explicitly, the note below implies it.

Variables.jpg

It's also worth noting that variables are ephemeral within the session: if the cluster or session is restarted or detached, they are lost.

Also, variables and state are isolated between different language REPLs. For example, Python variables are not accessible in Scala cells.
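As a rough plain-Python analogy for this isolation (an illustration only, not how Databricks implements notebook sessions), two separate dictionaries can stand in for the state of two notebooks. The names notebook_a_state and notebook_b_state are hypothetical.

```python
# Each dictionary stands in for the private namespace of one notebook session.
notebook_a_state = {}
notebook_b_state = {}

# Running Notebook_A's code only populates Notebook_A's own namespace.
exec('my_variable = "Hello from Notebook A"', notebook_a_state)

# Nothing was shared explicitly, so Notebook_B's namespace does not see the name.
visible_in_a = "my_variable" in notebook_a_state
visible_in_b = "my_variable" in notebook_b_state

print(visible_in_a)  # True
print(visible_in_b)  # False
```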

Does that answer your question?


Hi @Ashwin_DSA,

Thank you for sharing your input. However, the following response from a Databricks employee, @aladda, to a related question here seems to indicate that a variable in the called notebook is available in the calling notebook. Another reference, shown below from an article, seems to indicate the same. Am I missing something in these references?

Reference from @aladda:

  • %run is copying code from another notebook and executing it within the one it's called from. All variables defined in the notebook being called are therefore visible to the caller notebook
  • dbutils.notebook.run() is more around executing different notebooks in a workflow, an orchestration of sorts. Each notebook runs in an isolated spark session and passing parameters and return values is through a strictly defined interface.

Reference (last paragraph) from an article here:

The %run magic command functions differently compared to the dbutils.notebook.run() method. Unlike the latter, the %run command does not execute the child notebook independently; instead, it imports the child notebook into the existing parent Spark session. Consequently, any variables or functions defined in the child notebook become accessible in the parent notebook. This behavior is akin to the way Python imports work.
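The behavior described in these references can be approximated in plain Python by executing the child's source in the caller's namespace. This is only an analogy under that assumption and not the actual %run implementation. The names notebook_a_source and caller_namespace are hypothetical stand-ins.

```python
# Stand-in for the source code of Notebook_A (the called notebook).
notebook_a_source = 'my_variable = "Hello from Notebook A"'

# Stand-in for the namespace of Notebook_B (the calling notebook).
caller_namespace = {}

# Rough analogue of `%run "./Notebook_A"`: run the child's code inside the caller's namespace.
exec(notebook_a_source, caller_namespace)

# The name defined by the child is now visible to the caller.
print(caller_namespace["my_variable"])  # Hello from Notebook A
```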

Hi @Saf4Databricks,

Apologies, I focused on the post heading and completely overlooked the remarks you mentioned at the top, jumping straight to my response.

My earlier reply described the general isolation model for notebooks, but I didn’t explicitly separate it from the special behaviour of %run. I can confirm that %run still behaves as described in the older community answer and the article you linked: it executes the target notebook inline in the same session, and the functions and variables defined there become available in the caller notebook.

To prove that, I tested this in my workspace with the same example you provided. 

Notebook 1:

Variables_Notebook_1.jpg

Notebook 2:

Variables_Notebook_2.jpg

The general statement I made, that variables are local to a notebook’s session and not visible elsewhere, still holds unless you explicitly import another notebook with %run. Given all this, if your exact minimal example is not working, it likely points to something environmental, such as a language mismatch between the two notebooks, different default languages in the cells, or the %run cell not actually being executed before the print.

Can you confirm that:

  1. Both Notebook_A and Notebook_B show "Python" as the default language under the notebook title, and
  2. Both are attached to the same cluster.
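Alongside those checks, a quick way to see whether the %run cell actually defined the variable is to look the name up in the current namespace. The sketch below is plain Python; describe_name is a hypothetical helper, not a Databricks API, and the dictionary stands in for the notebook's globals.

```python
def describe_name(name, namespace):
    """Report whether `name` is defined in `namespace` and, if so, show its value."""
    if name in namespace:
        return f"{name} is defined with value {namespace[name]!r}"
    return f"{name} is not defined; the %run cell may not have executed"


# Stand-in for the calling notebook's namespace after a successful %run.
example_namespace = {"my_variable": "Hello from Notebook A"}

print(describe_name("my_variable", example_namespace))
print(describe_name("other_variable", example_namespace))
```

In a real notebook cell, passing globals() instead of example_namespace would inspect the names that are actually defined in the session.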

We can look into this more once you confirm.


Hi @Ashwin_DSA,

Thank you for looking into this issue. I just confirmed that both notebooks run on the same cluster (serverless), both are in the same folder, and both use the same language (Python), as shown below. Before executing Notebook_B, I also made sure the cluster was attached and running for both notebooks. As shown in image 2 below, the output of cell_2 displays the variable name and not its value.

I'm using the New Free Edition of Databricks (that replaced the Community Edition).

Notebook_A:

Notebook_A (screenshot)

Notebook_B:

Notebook_B (screenshot)

 


Saf4Databricks
Contributor

Hi @Ashwin_DSA, thank you for pointing out the cause of the error. This post can now be locked/closed.