10-19-2021 05:57 AM
Due to dependencies, if one of our cells errors then we want the notebook to stop executing.
We've noticed some odd behaviour when executing notebooks depending on whether "Run all cells in this notebook" is selected from the header versus "Run All Below".
In the example code below I've added an extra bracket to force the Python to fail and raise an exception. If I run the notebook using "Run all cells in this notebook", the exception is correctly thrown and the rest of the cells/commands in the notebook are skipped.
However, if I use "Run All Below", all cells are executed regardless of any exceptions or failures. We've tried using dbutils.notebook.exit but it doesn't work and subsequent cells are still run.
Is this the intended behaviour? It's frustrating when trying to rerun just part of a notebook.
Example Code:
Cmd 1
%python
try:
    spark.sql("""SELECT NOW()) AS force_error""")
except:
    print("Error running SQL")
    raise Exception("Error running SQL")
Cmd 2
SELECT NOW())
Cmd 3
SELECT NOW()
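(Editor's note: one workaround people use for the "Run All Below" case is to guard later cells with a success flag set by the earlier cells. This is a minimal sketch of that pattern, not a Databricks feature; the flag name `PIPELINE_OK` and the simulated failure are our own illustration, with the failing `spark.sql` call replaced by a plain `raise` so the snippet runs anywhere.)

```python
# Cell 1: attempt the risky work and record whether it succeeded.
PIPELINE_OK = True
try:
    # Stand-in for the failing spark.sql("""SELECT NOW()) AS force_error""") call.
    raise RuntimeError("simulated SQL failure")
except RuntimeError:
    PIPELINE_OK = False
    print("Error running SQL")

# Cell 2 (and every later cell): bail out early if an earlier cell failed.
# With "Run All Below", the cell still *executes*, but does no real work.
results = []
if PIPELINE_OK:
    results.append("dependent work ran")
else:
    print("Skipping: an earlier cell failed")
```

The downside is that every dependent cell needs the guard, which is why a notebook-level setting (as requested below in the thread) would be cleaner.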
10-22-2021 07:52 AM
@Laura McGill , this question just came up again here:
I went ahead and created an internal feature request. You can refer to it with DB-I-4250. If you'd like to help prioritize it, please contact your Account Executive or Customer Success Engineer and tell them your company needs DB-I-4250. Being able to attach your company name to feature requests helps the product team prioritize them.
Cheers!
10-19-2021 08:11 AM
That is how it works. When you do "Run all", the cells after the error are skipped.
When you do "Run All Below", they all get executed (and mostly throw errors).
Databricks probably has a reason for this behavior. The notebook experience is not bad but certainly not perfect.
10-19-2021 08:50 AM
Yes, this is expected behavior. Usually customers want all the cells to run regardless of whether one failed in between; dependent cells will typically error out anyway. If you can suggest a feature that would provide that functionality, I can make a feature request on your behalf.
10-20-2021 03:12 AM
Thanks all for your responses. It just seems strange that the behaviour differs and isn't documented publicly (that I can find). Ideally it would be configurable as part of the notebook settings. I've raised a new feature request here: Notebook Settings for handling errors · Community (azure.com)
08-15-2022 05:06 AM
I second this request. It's odd that the behaviour differs between "Run all" and "Run All Below". Please make it consistent and document it properly.
07-25-2024 09:07 AM
Has this been implemented? I have created a job using a notebook. My notebook has 6 cells, and if the code in the first cell fails, the rest of the cells should not run.
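(Editor's note: for the job case described above, the earlier replies suggest the "Run all" semantics apply, i.e. an uncaught exception stops the remaining cells. This sketch simply models cells as plain functions to make that fail-fast behaviour visible; the cell functions and the `executed` list are our own illustration, not Databricks APIs.)

```python
# Track which simulated "cells" actually ran.
executed = []

def cell_1():
    # Stand-in for a failing query in the first cell.
    raise ValueError("first cell failed")

def cell_2():
    executed.append("cell_2")

def cell_3():
    executed.append("cell_3")

# Fail-fast run: stop at the first uncaught exception,
# so none of the later cells execute.
try:
    for cell in (cell_1, cell_2, cell_3):
        cell()
except ValueError as exc:
    print(f"Stopping notebook run: {exc}")
```

The key point for the job scenario is to let the exception propagate (don't swallow it in a bare `except`), so the task is marked failed and subsequent cells are skipped.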