
Monitoring job metrics

Bagger
New Contributor II

Hi,

We need to monitor Databricks jobs, and we have a setup in which we are able to get the Prometheus metrics. However, we are lacking an overview of which metrics refer to what.

Namely, we need to monitor the following:

  • failed jobs: whether a job has failed
  • table ingest rate: how much data is ingested per unit of time
  • table ingest lag: whether a streaming job is further behind than expected
  • table size: the current size of the table being ingested into
  • query runtime: how long a query has been running

Does anyone have any ideas on how to get those metrics (either through Prometheus or an alternative method)?

0 REPLIES
