Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

Databricks Monitoring

Revathy123
New Contributor III

Hi Everyone,

Can someone suggest the best native job monitoring tool available in Databricks to fulfill my needs?

We need to monitor the following:

Number of failed jobs and their names, for the last 24 hours
Tables that are not getting data
Latest ingested timestamp
Table ingest rate: how much data is ingested
Table ingest lag: is a stream job further behind than expected?
Table size: size of the current table being ingested into
Query runtime: the time a query has been running
Thanks in advance

2 ACCEPTED SOLUTIONS

Kaniz_Fatma
Community Manager

Hi @Revathy123 , Databricks provides robust functionality for monitoring custom application metrics, streaming query events, and application log messages.


- Databricks offers native job monitoring tools: Databricks Monitoring lets you track metrics and events for your jobs, including the number of failed jobs and their names.
- You can monitor tables that are not receiving data, the latest ingested timestamp, the rate at which data is being ingested, and whether a stream job is falling behind the expected ingest rate.
- You can also track the size of the table currently being ingested into and the runtime of queries.
- On Azure, you can send monitoring data from Databricks to Azure Monitor, which provides advanced monitoring and alerting capabilities for Databricks applications.
- On AWS, you can integrate Databricks with CloudWatch, which lets you derive metrics from logs and set up alerts.
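To make the "tables that are not getting data" and "latest ingested timestamp" checks concrete, here is a minimal Python sketch. The table names, timestamps, and 24-hour threshold are all made up for illustration; in practice you would collect the latest-ingested timestamps from your ingestion pipeline or table metadata.

```python
from datetime import datetime, timedelta, timezone

def stale_tables(latest_ingest, max_age, now=None):
    """Return names of tables whose latest ingested timestamp is older than max_age.

    latest_ingest: dict mapping table name -> datetime of the last ingested row.
    max_age: timedelta; tables with no new data for longer than this are stale.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(name for name, ts in latest_ingest.items() if now - ts > max_age)

# Hypothetical example data.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
latest = {
    "sales_raw": now - timedelta(hours=1),    # fresh
    "clicks_raw": now - timedelta(hours=30),  # no data for over 24 hours
}
print(stale_tables(latest, timedelta(hours=24), now=now))  # ['clicks_raw']
```

An alerting job could run a check like this on a schedule and notify when the returned list is non-empty.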


Revathy123
New Contributor III

@Kaniz_Fatma I am new to this; could you please help me understand how we can achieve all of those? Will the Databricks Jobs API help me achieve this?


4 REPLIES


Revathy123
New Contributor III


Can someone suggest which monitoring tool would help me, and how we can achieve this?

BilalAslamDbrx
Honored Contributor III

Let me go through these one by one:

Number of failed jobs and their names, for the last 24 hours

[BA] Your best bet will be to use the upcoming system tables integration. This is in preview, I believe. The general idea is that you will get a Delta table with runs and their statuses. For now, you can also use the Job Runs page (Workflows > Job Runs), which shows job runs and their failures (also accessible via the API).
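To sketch the API route mentioned above: the Jobs API runs list endpoint (`GET /api/2.1/jobs/runs/list`) returns runs whose `state` includes a `result_state` once they finish. The HTTP call itself is omitted here; `runs` stands in for the `runs` array of that response, and the exact field set is worth verifying against the Jobs API docs.

```python
from datetime import datetime, timezone

def failed_runs_last_24h(runs, now_ms=None):
    """Filter a Jobs API runs list down to failures from the last 24 hours.

    runs: list of run dicts shaped like GET /api/2.1/jobs/runs/list output,
    each with 'run_name', 'start_time' (epoch millis), and a 'state' dict
    carrying 'result_state' once the run has finished.
    """
    if now_ms is None:
        now_ms = int(datetime.now(timezone.utc).timestamp() * 1000)
    cutoff = now_ms - 24 * 60 * 60 * 1000
    return [
        (run["run_name"], run["state"].get("result_state"))
        for run in runs
        if run["start_time"] >= cutoff
        and run["state"].get("result_state") == "FAILED"
    ]

# Hypothetical payload, trimmed to the fields used above.
now_ms = 1_700_000_000_000
runs = [
    {"run_name": "daily_etl", "start_time": now_ms - 3_600_000,
     "state": {"result_state": "FAILED"}},
    {"run_name": "hourly_sync", "start_time": now_ms - 7_200_000,
     "state": {"result_state": "SUCCESS"}},
    {"run_name": "old_job", "start_time": now_ms - 48 * 3_600_000,
     "state": {"result_state": "FAILED"}},  # failed, but older than 24h
]
print(failed_runs_last_24h(runs, now_ms=now_ms))  # [('daily_etl', 'FAILED')]
```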

Tables that are not getting data
[BA] Lakehouse Monitoring is the way to go! Also in Preview, I believe.

Latest ingested timestamp
Table ingest rate: how much data is ingested

[BA] How are you ingesting data?
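Whatever the ingestion path, one way to approximate per-table ingest volume for a Delta table is its commit history: `DESCRIBE HISTORY` exposes `operationMetrics` such as `numOutputRows` per write. Below is a sketch over history rows treated as plain dicts (timestamps simplified to epoch millis); it illustrates the idea and is not an official API wrapper.

```python
def rows_ingested_since(history, since_ts_ms):
    """Sum rows written to a Delta table since a given time.

    history: list of commit dicts in the shape of DESCRIBE HISTORY output,
    each with 'timestamp' (epoch millis here for simplicity) and an
    'operationMetrics' dict whose 'numOutputRows' value is a string.
    """
    total = 0
    for commit in history:
        if commit["timestamp"] >= since_ts_ms:
            metrics = commit.get("operationMetrics") or {}
            total += int(metrics.get("numOutputRows", 0))
    return total

# Hypothetical history rows: two commits inside the window, one before it.
history = [
    {"timestamp": 1_700_003_600_000, "operationMetrics": {"numOutputRows": "250"}},
    {"timestamp": 1_700_000_000_000, "operationMetrics": {"numOutputRows": "1000"}},
    {"timestamp": 1_699_900_000_000, "operationMetrics": {"numOutputRows": "9999"}},
]
print(rows_ingested_since(history, 1_700_000_000_000))  # 1250
```

Dividing the returned row count by the window length gives a rough ingest rate.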

Table ingest lag: is a stream job further behind than expected?

[BA] We are working on this! You will get monitoring and alerting on streaming lag both in Structured Streaming and Delta Live Tables.
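Until that lands, a Structured Streaming job reading from Kafka can already estimate its lag from `StreamingQuery.lastProgress`: each source reports `endOffset` (processed up to) and `latestOffset` (available) as JSON strings. Here is a sketch over that dict; the exact fields depend on the source and Spark version, so treat the shape as an assumption to verify.

```python
import json

def kafka_backlog(progress):
    """Estimate per-partition backlog from a Structured Streaming progress dict.

    progress: the dict form of StreamingQuery.lastProgress for a Kafka source,
    where 'endOffset' (processed up to) and 'latestOffset' (available) are
    JSON strings of {topic: {partition: offset}}.
    """
    backlog = {}
    for source in progress.get("sources", []):
        end = json.loads(source.get("endOffset") or "{}")
        latest = json.loads(source.get("latestOffset") or "{}")
        for topic, parts in latest.items():
            for part, latest_off in parts.items():
                done = end.get(topic, {}).get(part, 0)
                backlog[f"{topic}-{part}"] = latest_off - done
    return backlog

# Hypothetical progress snapshot: partition 0 is 30 messages behind.
progress = {
    "sources": [{
        "endOffset": '{"events": {"0": 100, "1": 80}}',
        "latestOffset": '{"events": {"0": 130, "1": 80}}',
    }]
}
print(kafka_backlog(progress))  # {'events-0': 30, 'events-1': 0}
```

A growing backlog over several progress updates is the signal that the stream is falling further behind than expected.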

Table size: size of the current table being ingested into

[BA] I'm not sure what you mean by this.

Query runtime: the time a query has been running

[BA] What type of query are you interested in?
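For SQL warehouse queries specifically, the Query History API (`GET /api/2.0/sql/history/queries`) reports per-query status and start times, from which elapsed runtime follows. A sketch over that payload; the field names used here (`query_id`, `query_start_time_ms`, `status`) are my reading of that API and worth double-checking against the docs.

```python
def running_query_ages(queries, now_ms):
    """Report how long each currently running SQL query has been running.

    queries: list of dicts shaped like Query History API results, each with
    'query_id', 'query_start_time_ms' (epoch millis), and 'status'.
    Returns {query_id: elapsed seconds} for queries with status 'RUNNING'.
    """
    return {
        q["query_id"]: (now_ms - q["query_start_time_ms"]) / 1000.0
        for q in queries
        if q.get("status") == "RUNNING"
    }

# Hypothetical payload: one query still running, one already finished.
now_ms = 1_700_000_000_000
queries = [
    {"query_id": "q1", "query_start_time_ms": now_ms - 90_000, "status": "RUNNING"},
    {"query_id": "q2", "query_start_time_ms": now_ms - 5_000, "status": "FINISHED"},
]
print(running_query_ages(queries, now_ms))  # {'q1': 90.0}
```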
