Monitoring job metrics
08-31-2023 03:19 AM
Hi,
We need to monitor Databricks jobs, and we have a setup where we are able to get the Prometheus metrics. However, we are lacking an overview of which metrics refer to what.
Namely, we need to monitor the following:
- failed jobs: whether a job has failed
- table ingest rate: how much data is being ingested
- table ingest lag: whether a streaming job is further behind than expected
- table size: the current size of the table being ingested into
- query runtime: how long a query has been running
Does anyone have any ideas on how to get those metrics (either through Prometheus or an alternative method)?
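To make the "failed jobs" item concrete, here is a rough sketch of the kind of thing we have in mind. It assumes the Jobs API `runs/list` endpoint (`/api/2.1/jobs/runs/list`) is reachable with a personal access token and that each run carries a `state.result_state` field. The metric name `databricks_job_failed_runs`, the helper names, and the choice to count `TIMEDOUT` as a failure are our own placeholders, not anything Databricks defines. It does not handle pagination and is not a finished exporter.

```python
import json
import urllib.request

# Assumption: these result states are what we would treat as "failed".
FAILED_STATES = {"FAILED", "TIMEDOUT"}


def fetch_completed_runs(host, token):
    """Fetch one page of completed runs from the Jobs API (no pagination)."""
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/list?completed_only=true",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def failed_runs(payload):
    """Return the runs whose result_state is in FAILED_STATES."""
    runs = payload.get("runs", [])
    return [
        run for run in runs
        if run.get("state", {}).get("result_state") in FAILED_STATES
    ]


def to_prometheus(payload, metric="databricks_job_failed_runs"):
    """Render the failed-run count as one Prometheus text-format line.

    The metric name is a hypothetical placeholder.
    """
    return f"{metric} {len(failed_runs(payload))}"
```

The parsing is kept separate from the HTTP call so it can be checked against a saved response, e.g. `to_prometheus(fetch_completed_runs(host, token))` with our own workspace URL and token. We have nothing comparable yet for the ingest rate, ingest lag, table size, or query runtime items.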