Hi @Bagger, you can monitor Databricks jobs and collect the metrics you listed by combining built-in Databricks features with Prometheus. Here's a general idea of how you could approach each one, with a short sketch for each after the list.
1. Failed jobs: Databricks provides a REST API (the Jobs API) that lets you list runs for a job along with their result state, so you can detect failed runs (see the first sketch after this list).
2. Table ingest rate: This can be monitored through Delta Lake. Every commit to a Delta table is recorded in the table history together with operation metrics such as rows written, which you can turn into an ingest rate (second sketch below).
3. Table ingest lag: Depending on your specific use case this can be a bit more complex. If you ingest with Structured Streaming, though, you can use its built-in progress reporting, which includes metrics such as processing rate and end-to-end batch latency (third sketch below).
4. Table size: Also covered by Delta Lake; `DESCRIBE DETAIL` on a Delta table reports its current size in bytes (fourth sketch below).
5. Query runtime: Databricks SQL lets you monitor and analyze your SQL workloads, and its Query History API reports execution times per query (fifth sketch below).
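For failed jobs, here's a minimal sketch against the Jobs API 2.1 `runs/list` endpoint; the workspace URL, token, and job ID are placeholders you'd replace with your own:

```python
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

def failed_runs(job_id: int, limit: int = 25) -> list:
    """Return recent runs of a job whose result state is FAILED."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"job_id": job_id, "limit": limit},
    )
    resp.raise_for_status()
    runs = resp.json().get("runs", [])
    # A run has failed when its terminal result_state is FAILED.
    return [
        r for r in runs
        if r.get("state", {}).get("result_state") == "FAILED"
    ]
```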
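For the ingest rate, a sketch that reads the Delta table history from a notebook (where `spark` is already defined); the table name is hypothetical:

```python
from delta.tables import DeltaTable

TABLE = "main.default.events"  # hypothetical table name

# Each commit is one row in the history; write/merge commits carry
# operationMetrics such as numOutputRows. Comparing rows written
# across commit timestamps approximates an ingest rate.
for row in DeltaTable.forName(spark, TABLE).history(20).collect():
    metrics = row["operationMetrics"] or {}
    if "numOutputRows" in metrics:
        print(row["timestamp"], row["operation"], metrics["numOutputRows"])
```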
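For ingest lag with Structured Streaming, `lastProgress` on the query handle describes the most recent micro-batch; `query` below is assumed to be the handle returned by `writeStream.start()`:

```python
# `query` is the StreamingQuery returned by writeStream.start().
progress = query.lastProgress
if progress:
    # durationMs breaks down where batch time was spent;
    # triggerExecution is roughly the end-to-end batch latency.
    print("batch:", progress["batchId"])
    print("rows/sec:", progress["processedRowsPerSecond"])
    print("latency ms:", progress["durationMs"]["triggerExecution"])
```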
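For table size, `DESCRIBE DETAIL` returns a single row that includes `sizeInBytes` and `numFiles` (again assuming a notebook where `spark` exists):

```python
TABLE = "main.default.events"  # hypothetical table name

# DESCRIBE DETAIL yields one row of table-level metadata for a Delta table.
detail = spark.sql(f"DESCRIBE DETAIL {TABLE}").collect()[0]
print("size_bytes:", detail["sizeInBytes"], "files:", detail["numFiles"])
```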
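For query runtime, a sketch against the Databricks SQL Query History API; the field names follow the 2.0 API as I recall them, and the host/token are placeholders:

```python
import requests

HOST = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                        # placeholder

resp = requests.get(
    f"{HOST}/api/2.0/sql/history/queries",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"max_results": 50},
)
resp.raise_for_status()
for q in resp.json().get("res", []):
    # duration is reported in milliseconds.
    print(q.get("query_text", "")[:60], q.get("duration"), "ms")
```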
As for Prometheus: you can keep using it for monitoring, since the interfaces you work with (e.g., PromQL, alerting configs) stay the same even though the backend of the monitoring system has been migrated to M3. One way to wire things together is to push the metrics gathered above to a Pushgateway that Prometheus scrapes (sketch below).
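This sketch uses the `prometheus_client` package to push one of the metrics above; the gateway address and the metric value are assumptions:

```python
from prometheus_client import CollectorRegistry, Gauge, push_to_gateway

registry = CollectorRegistry()
table_size = Gauge(
    "delta_table_size_bytes",
    "Current size of a Delta table in bytes",
    ["table"],
    registry=registry,
)
# In practice, call set() with the sizeInBytes value read above.
table_size.labels(table="main.default.events").set(123456789)

# Gateway address is an assumption; point it at your own Pushgateway.
push_to_gateway("pushgateway:9091", job="databricks_metrics", registry=registry)
```

Since only the write path changes, your existing PromQL queries and alert rules should keep working as-is.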
Sources:
- [Databricks REST API](https://docs.databricks.com/dev-tools/api/latest/index.html)
- [Databricks Delta](https://docs.databricks.com/delta/index.html)
- [Databricks Structured Streaming](https://docs.databricks.com/spark/latest/structured-streaming/index.html)
- [Databricks SQL Analytics](https://docs.databricks.com/sql/index.html)