Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Is there a way to capture the notebook logs from ADF pipeline?

SailajaB
Valued Contributor III

Hi,

I would like to capture custom log exceptions (Python) raised in a notebook that runs from an ADF pipeline. Based on those exceptions, the pipeline should succeed or fail.

Is there any mechanism to implement this? In my testing, the ADF pipeline succeeds regardless of the logged errors.

The notebook always returns SUCCESS to ADF's activity, even when an exception is raised in the notebook. If a notebook contains any exceptions, the ADF pipeline that contains that notebook activity should fail.

Thank you

10 REPLIES

Prabakar
Esteemed Contributor III

Hi @Sailaja B, notebook errors are tracked in the driver log4j output. You can check the cluster's driver logs to get this information. Alternatively, you can configure log delivery on your cluster so that all messages are written to the DBFS or storage path that you provide.

Please refer to the documentation:

https://docs.databricks.com/clusters/configure.html#cluster-log-delivery-1

SailajaB
Valued Contributor III

Thank you for your response.

My question is not about storing or retrieving the log info.

My scenario is:

The notebook always returns SUCCESS to ADF's activity, even when an exception is raised in the notebook. If a notebook contains any exceptions, the ADF pipeline that contains that notebook activity should fail.

Prabakar
Esteemed Contributor III

Hi @Sailaja B, a notebook/job fails only when there is a real failure. Some exceptions are informational and might not affect the running notebook. To understand this better, please share the exception that you see in the notebook output.

SailajaB
Valued Contributor III

Here is the sample code

if not any(mount.mountPoint == "/test/" for mount in dbutils.fs.mounts()):
    dbutils.fs.mount(source = "***",
                     mount_point = "/test/",
                     extra_configs = configs)
else:
    logger.error("Directory is already mounted")

Note: The mount path already exists. If I run this notebook through an ADF pipeline, I expect the pipeline to fail, but it does not.

Hi @Sailaja B,

You will need to raise/throw an exception to stop your Spark execution. Try using a try..except block to handle your custom exceptions.
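
A minimal sketch of that advice, applied to the mount check from the sample code. It assumes the notebook activity is reported as failed when an exception goes unhandled in the notebook. `MountAlreadyExistsError` and `ensure_not_mounted` are hypothetical names used only for illustration, and the mount points are passed in as plain strings so the logic can be tried outside Databricks:

```python
import logging

logger = logging.getLogger("notebook")


class MountAlreadyExistsError(RuntimeError):
    """Hypothetical custom exception for an already-mounted path."""


def ensure_not_mounted(mount_points, target):
    """Log and raise if `target` is already among `mount_points`."""
    if target in mount_points:
        # Logging alone does not fail the run; raising does.
        logger.error("Directory is already mounted: %s", target)
        raise MountAlreadyExistsError(f"{target} is already mounted")


# In a Databricks notebook the list would come from dbutils (not available here):
#   ensure_not_mounted([m.mountPoint for m in dbutils.fs.mounts()], "/test/")
ensure_not_mounted(["/mnt/data/"], "/test/")  # not mounted, so no exception
```

The key change from the original sample is the `raise` after `logger.error`: a logged error by itself leaves the notebook in a successful state.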

SailajaB
Valued Contributor III

Hi @Jose Gonzalez​ ,

Thank you for your reply.

It is working as expected with try..except and assert False.
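
The exact snippet was not posted in the thread, so the following is only a guess at the pattern described: catch the exception, log it, then force a failure. `run_step` and `failing_step` are hypothetical names. Since Python drops `assert` statements when run with `-O`, re-raising is shown as the default and `assert False` as the alternative:

```python
import logging

logger = logging.getLogger("notebook")


def run_step(step, reraise=True):
    """Run `step`; on any exception, log it and make the failure visible."""
    try:
        return step()
    except Exception as error:
        logger.error("Step failed: %r", error)
        if reraise:
            raise  # propagate the original exception
        assert False, f"Step failed: {error!r}"  # the variant from the thread


def failing_step():
    """Stand-in for notebook logic that hits an error."""
    raise ValueError("Directory is already mounted")
```

Calling `run_step(failing_step)` in a notebook cell would leave the exception unhandled, which is what should cause the run, and with it the ADF activity, to be reported as failed.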

Hi @Sailaja B, is it working now? I mean, does the pipeline show a failure when the notebook fails? If yes, could you please share a sample snippet? (I am trying the same case. I am able to capture logs on the pipeline side using the output JSON, but I couldn't modify the pipeline status.)

-werners-
Esteemed Contributor III

In addition to the options mentioned, you can also analyze logs using Azure Log Analytics:

https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagno...

Hubert-Dudek
Esteemed Contributor III

Also, when you catch an exception you can save it anywhere, even to a Databricks table, something like:

try:
    (...)
except Exception as error:
    # Save the error to a table; (...) is elided and would include repr(error)
    spark.sql(f"""INSERT INTO (...) """)
    # Return the error text as the notebook's exit value
    dbutils.notebook.exit(str(jobId) + ' - ERROR!!! - ' + repr(error))

In my opinion, as @werners said, sending logs to Azure Log Analytics is a good choice for detailed analysis, but I also like to use the method above and have a nice table in Databricks with the jobs that failed 😉
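
One caveat worth checking: as far as I know, `dbutils.notebook.exit` ends the run normally, so the ADF activity would still report success. A sketch that records the error first and then re-raises it is shown here. `record_failure` stands in for the `INSERT INTO` call and is a hypothetical callback, as are `run_and_record` and the in-memory `failures` list:

```python
failures = []  # stand-in for the Databricks table of failed jobs


def record_failure(job_id, error):
    """Hypothetical sink; in a notebook this could be the spark.sql INSERT."""
    failures.append({"job_id": job_id, "error": repr(error)})


def run_and_record(job_id, step, sink=record_failure):
    """Run `step`; on failure, save the error via `sink`, then re-raise."""
    try:
        return step()
    except Exception as error:
        sink(job_id, error)
        raise  # keep the run failing instead of exiting cleanly


def broken_step():
    """Stand-in for notebook logic that hits an error."""
    raise RuntimeError("mount failed")


try:
    run_and_record("job-1", broken_step)
except RuntimeError:
    pass  # caught only for this demo; in a notebook, let it propagate

print(failures)
```

This keeps the table of failed jobs while still letting the pipeline see the failure.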

GurpreetSethi
New Contributor III

Hi SailajaB,

Try this out.

Once a notebook has executed successfully, it returns a long JSON-formatted output. We need to specify the appropriate nodes to fetch the output.

In the example screenshot, the notebook returns empName and empCity as output when it runs.

To capture this, we need to:

  • In the respective pipeline, add a VARIABLE (to capture the output of the NOTEBOOK task)
  • Add a SET VARIABLE activity that uses the VARIABLE defined in the step above, with the expression below:

@activity('YOUR NOTEBOOK ACTIVITY NAME').output.runOutput.an_object.name.value

  • Add a link between the NOTEBOOK activity and the SET VARIABLE activity
  • Run your pipeline and you should see the output captured in this variable

Note: To specify a custom return value, use:

dbutils.notebook.exit('VALUE YOU WANT TO RETURN')
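
To return several values at once, one option is to serialize them as JSON and pass that string to `dbutils.notebook.exit`; ADF should then expose the properties under `output.runOutput`. A small sketch, where `build_exit_payload` is a hypothetical helper and the employee values are made up:

```python
import json


def build_exit_payload(status, **values):
    """Serialize a status plus arbitrary values into a JSON string."""
    return json.dumps({"status": status, **values})


payload = build_exit_payload("OK", empName="Jane", empCity="Sydney")
print(payload)

# In a Databricks notebook (dbutils is not available outside it):
#   dbutils.notebook.exit(payload)
# In ADF, a single value could then be read with an expression such as:
#   @activity('YOUR NOTEBOOK ACTIVITY NAME').output.runOutput.empName
```

A `status` field like this also gives the pipeline something to branch on, for example in an If Condition activity, when the notebook handles an error itself instead of raising it.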

Let me know how it goes.

Cheers

GS

