Databricks Community

Bagger · 08-31-2023

Hi,

We need to monitor Databricks jobs, and we have a setup where we are able to get the Prometheus metrics. However, we are lacking an overview of which metrics refer to what.

Namely, we need to monitor the following:

failed jobs : has a job failed
table ingest rate : how much data is ingested
table ingest lag : is a stream job further behind than expected
table size : size of the current table being ingested into
query runtime : the time a query has been running

Does anyone have any ideas on how to get those metrics (either through Prometheus or an alternative method)?
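For illustration, here is a minimal sketch of how the first two items might be derived outside Prometheus. It rests on assumptions rather than a confirmed setup: that job runs are fetched as JSON shaped like the Jobs API `runs/list` response (a `runs` list whose entries carry `state.result_state`), and that streaming progress is available as a dict shaped like a Structured Streaming progress report (with `inputRowsPerSecond` and `processedRowsPerSecond`). The function names and the choice of which result states count as failures are hypothetical, and the field names should be checked against the API version in use.

```python
from typing import Any

# Assumption: these result states are what we would count as "failed".
FAILED_STATES = {"FAILED", "TIMEDOUT"}


def failed_runs(runs_response: dict) -> list:
    """Return the runs whose result_state is in FAILED_STATES.

    `runs_response` is assumed to look like the parsed JSON body of a
    Jobs API `runs/list` call: {"runs": [{"run_id": ..., "state": {...}}]}.
    """
    failed = []
    for run in runs_response.get("runs", []):
        state = run.get("state", {})
        # result_state is absent while a run is still in progress.
        if state.get("result_state") in FAILED_STATES:
            failed.append(run)
    return failed


def ingest_rates(progress: dict) -> dict:
    """Pull row rates out of one streaming progress report.

    `progress` is assumed to be a dict with the top-level keys
    `inputRowsPerSecond` and `processedRowsPerSecond`.
    """
    rates: dict[str, Any] = {
        "input_rows_per_second": progress.get("inputRowsPerSecond"),
        "processed_rows_per_second": progress.get("processedRowsPerSecond"),
    }
    return rates
```

If input rows per second stays above processed rows per second for a long time, that could serve as a rough signal for the ingest lag item, although it does not measure lag directly.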

Bagger · 08-31-2023

I have reposted this post in "Administration and Architecture"

Monitoring job metrics - Databricks - 42956
