
Best practice for logging in Databricks notebooks?

Gim
Contributor

What is the best practice for logging in Databricks notebooks?

I have a bunch of notebooks that run in parallel through a workflow. I would like to keep track of everything that happens, such as errors coming from a stream, and to persist these logs somewhere, either in DBFS or in a storage account.

I got the built-in logging module working, but you have to manually copy the log file from a temp folder on the local file: filesystem to dbfs:/FileStore/log_folder/text.log; DBFS throws an error if the FileHandler is pointed directly at a dbfs: path.
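
For reference, a minimal sketch of that workaround (the logger name, log format, and paths are illustrative):

import logging

# FileHandler cannot write straight to a dbfs:/ path, so log to the driver's
# local disk first and copy the file over afterwards.
local_log = "/tmp/notebook_run.log"

logger = logging.getLogger("my_notebook")
logger.setLevel(logging.INFO)
handler = logging.FileHandler(local_log)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

logger.info("stream started")
logger.error("example error from the stream")

# Flush, then copy the finished log into DBFS (or a mounted storage-account path).
# dbutils is available by default in Databricks notebooks.
handler.flush()
dbutils.fs.cp("file:" + local_log, "dbfs:/FileStore/log_folder/text.log")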

This basically works for my purposes, but what is the actual best practice for doing this in Databricks?

5 REPLIES

Debayan
Esteemed Contributor III

Configuring verbose audit logs and audit log delivery can be one of the best practices:

https://docs.databricks.com/administration-guide/account-settings/audit-logs.html
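
If it helps, a rough sketch of enabling audit log delivery through the account-level REST API on AWS (the endpoint and field names follow the docs linked above; the account ID, credentials, storage configuration, and prefix are placeholders you must create first):

import requests

ACCOUNT_ID = "<databricks-account-id>"
CREDENTIALS_ID = "<credentials-id>"               # from a credential configuration created earlier
STORAGE_CONFIGURATION_ID = "<storage-config-id>"  # points at the bucket that receives the logs

resp = requests.post(
    f"https://accounts.cloud.databricks.com/api/2.0/accounts/{ACCOUNT_ID}/log-delivery",
    auth=("<account-admin-email>", "<password>"),  # account-level auth; adjust to your setup
    json={
        "log_delivery_configuration": {
            "config_name": "audit-logs",
            "log_type": "AUDIT_LOGS",
            "output_format": "JSON",
            "credentials_id": CREDENTIALS_ID,
            "storage_configuration_id": STORAGE_CONFIGURATION_ID,
            "delivery_path_prefix": "audit-logs",
        }
    },
)
resp.raise_for_status()
print(resp.json())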

Hubert-Dudek
Esteemed Contributor III

Please consider integrating Databricks with Datadog: https://www.datadoghq.com/blog/databricks-monitoring-datadog/
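
The blog above installs the Datadog agent on each cluster node through an init script. A rough sketch of creating such a script from a notebook (the install command and API-key handling are illustrative; follow the blog for the current, supported procedure):

init_script = """#!/bin/bash
# DD_API_KEY is assumed to be injected as a cluster environment variable,
# e.g. via a Databricks secret reference in the cluster configuration.
DD_API_KEY=$DD_API_KEY bash -c "$(curl -L https://s3.amazonaws.com/dd-agent/scripts/install_script.sh)"
"""

dbutils.fs.put("dbfs:/init-scripts/datadog-install.sh", init_script, True)
# Then attach dbfs:/init-scripts/datadog-install.sh as a cluster-scoped init script.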

Kaniz
Community Manager

Hi @Gimwell Young, we haven't heard from you since the last response from @Hubert Dudek and @Debayan Mukherjee, and I was checking back to see if you have a resolution yet.

If you have any solution, please share it with the community, as it can be helpful to others. Otherwise, we will respond with more details and try to help.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.

karthik_p
Esteemed Contributor

@Gimwell Young As @Debayan Mukherjee mentioned, if you configure verbose logging at the workspace level, the logs will be delivered to the storage bucket you provided during configuration. From there you can pull them into any licensed log-monitoring tool, e.g. Splunk. The same configuration can also be used to monitor Unity Catalog logs. As @Hubert Dudek mentioned, if you configure Datadog, you will get a graphical view of the resources your workspace consumes, such as the number of active clusters, active jobs, etc.

Kaniz
Community Manager

Hi @Gimwell Young, we haven't heard from you since the last response from @karthik p, @Debayan Mukherjee, and @Hubert Dudek, and I was checking back to see if their suggestions helped you.

If you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
