Best practice for logging in Databricks notebooks?
11-02-2022 03:30 PM
What is the best practice for logging in Databricks notebooks?
I have a bunch of notebooks that run in parallel through a workflow. I would like to keep track of everything that happens, such as errors coming from a stream, and have these logs persisted somewhere, either in DBFS or in a storage account.
I got the built-in logging module working, but the log file has to be written to a local temp folder (a file: path) and then manually copied to dbfs:/FileStore/log_folder/text.log. DBFS throws an error if the DBFS path is assigned directly to the FileHandler.
This basically works for my purposes, but what is the actual best practice for doing this in Databricks?
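For reference, a minimal sketch of the workaround described above, assuming dbutils is available in the notebook; the path names are illustrative only:

```python
import logging

# Write the log to the driver's local filesystem first; a DBFS path cannot be
# passed directly to FileHandler because DBFS does not support the append
# semantics the handler relies on.
local_log_path = "/tmp/notebook_run.log"                          # illustrative local path
dbfs_log_path = "dbfs:/FileStore/log_folder/notebook_run.log"     # illustrative DBFS target

logger = logging.getLogger("notebook_logger")
logger.setLevel(logging.INFO)

handler = logging.FileHandler(local_log_path)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))
logger.addHandler(handler)

logger.info("Stream started")
logger.error("Example error from a stream")

# Flush the handler, then copy the log file from the local filesystem into
# DBFS so it survives after the cluster terminates.
handler.flush()
dbutils.fs.cp(f"file:{local_log_path}", dbfs_log_path)
```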
- Labels: Best practice, Databricks Notebooks, Logging
11-02-2022 11:14 PM
Configuring verbose audit logs and setting up audit log delivery is one recommended best practice.
https://docs.databricks.com/administration-guide/account-settings/audit-logs.html
11-03-2022 02:09 AM
Please consider integrating Databricks with Datadog: https://www.datadoghq.com/blog/databricks-monitoring-datadog/
11-03-2022 07:42 AM
@Gimwell Young As @Debayan Mukherjee mentioned, if you configure verbose audit logging at the workspace level, the logs are delivered to the storage bucket you provided during configuration. From there you can pull them into any licensed log-monitoring tool, e.g. Splunk. The same configuration can also be used to monitor Unity Catalog logs. As @Hubert Dudek mentioned, if you configure Datadog, you get a graphical view of the resources consumed by the workspace, such as the number of active clusters, active jobs, etc.