Is there any way to propagate errors from dbutils?

cvantassel
New Contributor III

I have a master notebook that runs a few different notebooks on a schedule using the dbutils.notebook.run() function. Occasionally, these child notebooks will fail (due to API connections or whatever). My issue is, when I attempt to catch the errors with:

try:
    dbutils.notebook.run(notebook_path, timeout_seconds=0)
except Exception as e:
    print(e)

The error is always the same regardless of the notebook/failure point:

An error occurred while calling o8701._run.
: com.databricks.WorkflowException: com.databricks.NotebookExecutionException: FAILED
	at com.databricks.workflow.WorkflowDriver.run(WorkflowDriver.scala:98)
	at com.databricks.dbutils_v1.impl.NotebookUtilsImpl.run(NotebookUtilsImpl.scala:134)
	at com.databricks.dbutils_v1.impl.NotebookUtilsImpl._run(NotebookUtilsImpl.scala:96)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:295)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:251)
	at java.lang.Thread.run(Thread.java:748)
Caused by: com.databricks.NotebookExecutionException: FAILED
	at com.databricks.workflow.WorkflowDriver.run0(WorkflowDriver.scala:146)
	at com.databricks.workflow.WorkflowDriver.run(WorkflowDriver.scala:93)
	... 13 more

It would be useful to capture the actual error that occurred in the child notebook, rather than this generic one that only indicates it failed.

I understand I could catch any exceptions and propagate them using the dbutils.notebook.exit() function, but I'd rather not have to wrap every potential issue in a try-except.
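For reference, the `exit()`-based workaround mentioned above can be sketched as follows. This is a minimal illustration, not a Databricks convention: the JSON payload schema and the helper names (`run_guarded`, `check_result`) are assumptions, and the `dbutils` calls appear only as comments since they exist only inside a notebook.

```python
import json
import traceback

# In the child notebook: wrap the work in a try/except and return a
# structured payload instead of letting the exception escape.
def run_guarded(work):
    """Run work() and return a JSON string describing the outcome."""
    try:
        result = work()
        payload = {"status": "OK", "result": result, "error": None}
    except Exception:
        payload = {"status": "FAILED", "result": None,
                   "error": traceback.format_exc()}
    # In a real child notebook the last line would be:
    # dbutils.notebook.exit(json.dumps(payload))
    return json.dumps(payload)

# In the master notebook: parse the string returned by
# dbutils.notebook.run() and re-raise with the real error attached.
def check_result(returned):
    payload = json.loads(returned)
    if payload["status"] != "OK":
        raise RuntimeError("Child notebook failed:\n" + payload["error"])
    return payload["result"]
```

The master notebook would then call something like `check_result(dbutils.notebook.run(path, 0))` to surface the child's actual traceback instead of the generic `WorkflowException`.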

Is there a better way to capture the errors that occur in a child notebook?

7 REPLIES

-werners-
Esteemed Contributor III

May I suggest another way of working?

You could use workflows or schedule the notebooks in Glue/Data Factory.

The difference would be that the notebooks don't all have to run on the same cluster, unlike your current setup.

I don't know if that is an option for you?

cvantassel
New Contributor III

Thanks for your suggestion, @werners, but that unfortunately won't work. 

We originally did have our jobs all scheduled separately, but the growing number of them made things messy since you need to click through the UI to find the jobs, then again to find the errors.

We're now trying to build a framework that logs runs into a table automatically so we can have all that information in one place. It would be mighty helpful if we could also capture what errors occurred, so we can recognize the type of error without needing to sift through the UI.

Have you tried using a custom logger to capture these error messages?

A custom logger would work, but we were hoping for a solution that didn't require us to write specific code in every notebook since the scheduler will be used across teams.
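One way to keep the logging out of the individual notebooks is to put it in the scheduler itself. A rough sketch under stated assumptions: the names `run_and_log` and `log_rows` are hypothetical, a plain list stands in for the logging table, and `run_fn` stands in for `dbutils.notebook.run` (in the real framework the rows would be appended to e.g. a Delta table).

```python
import datetime

def run_and_log(run_fn, notebook_path, log_rows, timeout_seconds=0):
    """Run a child notebook through the framework and record the outcome."""
    row = {"notebook": notebook_path,
           "started_at": datetime.datetime.utcnow().isoformat(),
           "status": "OK", "result": None, "error": None}
    try:
        row["result"] = run_fn(notebook_path, timeout_seconds)
    except Exception as e:
        row["status"] = "FAILED"
        # Note: for dbutils.notebook.run this is still only the generic
        # WorkflowException text, unless combined with an exit() payload.
        row["error"] = str(e)
    log_rows.append(row)
    return row
```

This centralizes the bookkeeping, but it does not by itself solve the original problem: the caught exception is still the placeholder `WorkflowException` unless the child notebooks also return structured errors.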

Anonymous
Not applicable

Hey there @Caleb Van Tassel

Hope all is well!

Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? If not, please let us know if you need more help.

We'd love to hear from you.

Thanks!

cvantassel
New Contributor III

Unfortunately, we haven't been able to resolve this. It seems like we're stuck either manually clicking through notebooks, or specifically writing code every time we want an error to persist. Is there a place I can make a feature request? It would be very helpful if Databricks supported catching specific errors in notebooks using dbutils, rather than just throwing the placeholder WorkflowException.

wdphilli
New Contributor III

I have the same issue. I see no reason that Databricks couldn't propagate the internal exception back through their WorkflowException.
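One avenue worth exploring for the pattern discussed in this thread: each child run launched by `dbutils.notebook.run()` is an ephemeral job run with a run id, and the Databricks Jobs API endpoint `runs/get-output` reports an `error` field for failed notebook runs. This is a sketch only, hedged on my reading of the Jobs API docs; verify the field names against your workspace's API version, and note that obtaining the child run id programmatically (rather than from the run link in the UI) is itself the awkward part.

```python
from typing import Optional

def extract_run_error(run_output: dict) -> Optional[str]:
    """Pull the failure message out of a runs/get-output response dict."""
    return run_output.get("error")

# Sketch of the HTTP call (requires a PAT token and the child run's id;
# `host`, `token`, and `run_id` are placeholders):
# import requests
# resp = requests.get(
#     f"{host}/api/2.1/jobs/runs/get-output",
#     headers={"Authorization": f"Bearer {token}"},
#     params={"run_id": run_id},
# )
# print(extract_run_error(resp.json()))
```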
