Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Executing Notebooks - Run All Cells vs Run All Below

lei_armstrong
New Contributor II

Due to dependencies, if one of our cells errors then we want the notebook to stop executing.

We've noticed some odd behaviour when executing notebooks, depending on whether "Run all cells in this notebook" is selected from the header or "Run All Below" is used.

In the example code below I've added an extra bracket to force the Python to fail and raise an exception. If I run the notebook using "Run all cells in this notebook", the exception is correctly thrown and the rest of the cells/commands in the notebook are skipped.

However, if I use "Run All Below", all cells are executed regardless of any exceptions or failures. We've tried using dbutils.notebook.exit, but it doesn't work and subsequent cells are still run.

Is this the intended behaviour? It's frustrating when trying to rerun just part of a notebook.

Example Code:

Cmd 1

%python
# The extra closing bracket in the SQL is deliberate: it forces an error.
try:
    spark.sql("""SELECT NOW()) AS force_error""")
except Exception:
    print("Error running SQL")
    raise Exception("Error running SQL")

Cmd 2

SELECT NOW())

Cmd 3

SELECT NOW()

Accepted Solutions

Dan_Z
Honored Contributor

@Laura McGill, this question just came up again here:

https://community.databricks.com/s/question/0D53f00001PonToCAJ/executing-notebooks-run-all-cells-vs-...

I went ahead and created an internal feature request. You can refer to it with DB-I-4250. If you'd like to help prioritize it, please contact your Account Executive or Customer Success Engineer and tell them your company needs DB-I-4250. Being able to attach your company name to feature requests helps the product team prioritize them.

Cheers!


8 REPLIES

Kaniz_Fatma
Community Manager

Hi @lei_armstrong! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Thanks.

-werners-
Esteemed Contributor III

That is how it works. When you do 'run all', the cells after the error are skipped.

When you do run all below, they all get executed (and mostly throw an error).

Databricks probably has a reason for this behavior. The notebook experience is not bad but certainly not perfect.
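To make the difference concrete, here is a toy model in plain Python of the two behaviours described above. It does not call any Databricks API; `run_cells` and the `stop_on_error` flag are hypothetical names used only for illustration, and the three cells mirror Cmd 1 to Cmd 3 from the question.

```python
from typing import Callable, List, Tuple

Cell = Tuple[str, Callable[[], None]]


def run_cells(cells: List[Cell], stop_on_error: bool) -> List[str]:
    """Toy model: return the names of the cells that were executed."""
    executed = []
    for name, body in cells:
        executed.append(name)
        try:
            body()
        except Exception as exc:
            print(f"{name} failed: {exc}")
            if stop_on_error:
                # Models "Run all": remaining cells are skipped.
                break
            # Models "Run All Below": carry on with the next cell.
    return executed


def ok() -> None:
    pass  # stand-in for a cell that succeeds, e.g. SELECT NOW()


def bad() -> None:
    # Stand-in for a cell containing the malformed SQL.
    raise Exception("Error running SQL")


cells = [("Cmd 1", bad), ("Cmd 2", bad), ("Cmd 3", ok)]
run_all = run_cells(cells, stop_on_error=True)
run_all_below = run_cells(cells, stop_on_error=False)
```

In this model `run_all` contains only Cmd 1, while `run_all_below` contains all three cells, which matches what was reported in the question.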

Dan_Z
Honored Contributor

Yes, this is expected behavior. Usually customers want all the cells to run regardless of whether one failed in between. Usually later cells will error out anyway if they depend on the failed one. If you can suggest a feature that would provide that functionality, I can make a feature request on your behalf.

Kaniz_Fatma
Community Manager

Hi, if you want a new feature to be added, you can request it at this link:

https://docs.databricks.com/resources/ideas.html

lei_armstrong
New Contributor II

Thanks all for your responses. It just seems strange that the behaviour differs and isn't documented publicly (that I can find). Ideally it would be configurable as part of the notebook settings. I've raised a new feature request here: Notebook Settings for handling errors · Community (azure.com)

pinecone
New Contributor II

I second this request. It's odd that the behaviour is different when running all vs. running all below. Please make it consistent and document properly.

sukanya09
New Contributor II

Has this been implemented? I have created a job using a notebook. My notebook has 6 cells, and if the code in the first cell fails it should not run the rest of the cells.
