Is there a way to capture the notebook logs from ADF pipeline?

SailajaB
Valued Contributor III

Hi,

I would like to capture custom log exceptions (Python) from a notebook run by an ADF pipeline, so that the pipeline succeeds or fails based on those exceptions.

Is there any mechanism to implement this? In my testing, the ADF pipeline succeeds irrespective of the logged errors.

The notebook always returns SUCCESS to ADF's activity, even when an exception is raised in the notebook. If a notebook raises any exception, the ADF pipeline containing that notebook activity should fail.

Thank you


Prabakar
Esteemed Contributor III

Hi @Sailaja B​, notebook errors are tracked in the driver's log4j output. You can check the cluster's driver logs to get this information, or you can configure log delivery on your cluster so that all messages are logged to the DBFS or storage path that you provide.

Please refer to the document.

https://docs.databricks.com/clusters/configure.html#cluster-log-delivery-1
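For reference, log delivery is just a field on the cluster spec; a minimal sketch of the relevant part (the destination path here is only an example, any DBFS or storage path you control can be used):

# Sketch of the relevant field in a cluster spec (e.g. via the Clusters API or the JSON editor)
cluster_spec = {
    "cluster_name": "my-cluster",                      # example name
    "cluster_log_conf": {
        "dbfs": {"destination": "dbfs:/cluster-logs"}  # driver and executor logs land here
    },
    # ... other cluster settings ...
}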

SailajaB
Valued Contributor III

Thank you for your response.

My question is not about how to store or retrieve the log info.

My scenario is:

The notebook always returns SUCCESS to ADF's notebook activity, even when an exception is raised in the notebook. If a notebook raises any exception, the ADF pipeline containing that notebook activity should fail.

Prabakar
Esteemed Contributor III

Hi @Sailaja B​, a notebook/job failure only happens when there is really a failure. Some exceptions are informational and might not hurt the running notebook. To understand this better, please share the exception that you see in the notebook output.

SailajaB
Valued Contributor III

Here is the sample code

if not any(mount.mountPoint == "/test/" for mount in dbutils.fs.mounts()):
    dbutils.fs.mount(source="***",
                     mount_point="/test/",
                     extra_configs=configs)
else:
    logger.error("Directory is already mounted")

Note: the mount path already exists. If I run this notebook through the ADF pipeline, I expect the pipeline to fail, but it does not fail.

Jose Gonzalez

Hi @Sailaja B​,

You will need to raise/throw the exception to stop your Spark execution. Try using a try/except block to handle your custom exceptions.
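For example, a minimal sketch applied to the mount snippet above (this assumes configs and logger are already defined in the notebook):

if not any(mount.mountPoint == "/test/" for mount in dbutils.fs.mounts()):
    dbutils.fs.mount(source="***", mount_point="/test/", extra_configs=configs)
else:
    logger.error("Directory is already mounted")
    # Raising here fails the notebook run, so the ADF notebook activity is marked Failed
    raise Exception("Directory is already mounted")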

SailajaB
Valued Contributor III

Hi @Jose Gonzalez​ ,

Thank you for your reply.

It is working as expected with try/except and assert False.
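For anyone hitting the same issue, the pattern is roughly the following sketch (run_notebook_logic() is just a placeholder name, not a real function):

try:
    run_notebook_logic()  # placeholder for the notebook's actual work
except Exception as error:
    logger.error(repr(error))
    # The AssertionError propagates out of the notebook, so the ADF activity fails
    assert False, repr(error)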

Hi @Sailaja B​, is it working now? I mean, does the pipeline show a failure when the notebook fails? If yes, please share a sample snippet. (I am trying the same case. I am able to capture logs on the pipeline side using the output JSON, but I couldn't modify the pipeline status.)

-werners-
Esteemed Contributor III

In addition to the option already mentioned, there is also the possibility to analyze logs using Azure Log Analytics:

https://docs.microsoft.com/en-us/azure/databricks/administration-guide/account-settings/azure-diagno...

Hubert-Dudek
Esteemed Contributor III

Also, when you catch the exception you can save it anywhere, even to a Databricks table, with something like:

try:
    (...)
except Exception as error:
    spark.sql(f"""INSERT INTO (...) VALUES ('{repr(error)}')""")
    dbutils.notebook.exit(str(jobId) + ' - ERROR!!! - ' + repr(error))

In my opinion, as @werners said, sending the logs to Azure Log Analytics for detailed analysis is a good choice, but I also like to use the method above and just have a nice table in Databricks with the jobs that failed 😉
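Building on that, a slightly fuller sketch; the table name job_errors and the run_main_logic() helper are only assumptions for illustration, and jobId is assumed to be defined as in the snippet above:

from datetime import datetime

spark.sql("CREATE TABLE IF NOT EXISTS job_errors (job_id STRING, error STRING, ts TIMESTAMP)")

try:
    run_main_logic()  # placeholder for the notebook's actual work
except Exception as error:
    # Append one row per failure, then surface the error to ADF via the exit value
    spark.createDataFrame(
        [(str(jobId), repr(error), datetime.now())],
        "job_id STRING, error STRING, ts TIMESTAMP",
    ).write.mode("append").saveAsTable("job_errors")
    dbutils.notebook.exit(str(jobId) + ' - ERROR!!! - ' + repr(error))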

User16826994569
New Contributor III

Hi SailajaB,

Try this out.

A notebook, once executed successfully, returns a long JSON-formatted output. We need to specify the appropriate nodes to fetch the output.

In the screenshot (not reproduced here), we can see that when the notebook ran it returned empName & empCity as output.

To capture this, we need to:

  • In the respective pipeline, add a VARIABLE (to capture the output of the NOTEBOOK task)
  • Add a SET VARIABLE activity, use the VARIABLE defined in the step above, and add the expression below:

@activity('YOUR NOTEBOOK ACTIVITY NAME').output.runOutput.an_object.name.value

  • Add a link between the NOTEBOOK activity and the SET VARIABLE activity
  • Run your pipeline and you should see the output captured in this variable

Note: If you want to specify a custom return value, you need to use:

dbutils.notebook.exit('VALUE YOU WANT TO RETURN')
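For example, a minimal sketch returning a small JSON payload as the custom value (the field names here are just illustrative):

import json

# In the notebook: return a small JSON payload as the exit value
result = {"status": "SUCCEEDED", "empName": "John", "empCity": "Seattle"}
dbutils.notebook.exit(json.dumps(result))

On the ADF side, the Set Variable expression can then read the returned value from @activity('YOUR NOTEBOOK ACTIVITY NAME').output.runOutput.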

Let me know how it goes.

Cheers

GS
