Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
Databricks Monitoring

Revathy123
New Contributor III

Hi Everyone,

Can someone suggest the best native job monitoring tool available in Databricks for my needs?

We need to monitor the following:

Number of failed jobs and their names for the last 24 hours
Tables that are not getting data
Latest ingested timestamp
Table ingest rate: how much data is ingested
Table ingest lag: is a streaming job further behind than expected
Table size: size of the table currently being ingested into
Query runtime: how long a query has been running
Thanks in advance

1 ACCEPTED SOLUTION

Accepted Solutions

Revathy123
New Contributor III

@Retired_mod I am new to this. Could you please help me understand how we can achieve all of these? Will the Databricks Jobs API help me achieve this?

5 REPLIES

Revathy123
New Contributor III

@Retired_mod I am new to this. Could you please help me understand how we can achieve all of these? Will the Databricks Jobs API help me achieve this?

Can someone suggest which monitoring tool would help me, and how we can achieve this?

BilalAslamDbrx
Databricks Employee

Let me go through these one by one:

Number of failed jobs and their names for the last 24 hours

[BA] Your best bet will be to use the upcoming system tables integration. This is in preview, I believe. The general idea is that you will get a Delta table with runs and their statuses. For now, you can also use the Job Runs page (Workflows > Job Runs), which shows job runs and their failures (also accessible via the API).

Tables that are not getting data
[BA] Lakehouse Monitoring is the way to go! Also in Preview, I believe.

Latest ingested timestamp
Table ingest rate: how much data is ingested

[BA] How are you ingesting data?

Table ingest lag: is a streaming job further behind than expected

[BA] We are working on this! You will get monitoring and alerting on streaming lag both in Structured Streaming and Delta Live Tables.

Table size: size of the table currently being ingested into

[BA] I'm not sure what you mean by this.

Query runtime: how long a query has been running

[BA] What type of query are you interested in?

Yaadhudbe
New Contributor

You can use the Databricks API to collect all the required information.

https://docs.databricks.com/api/workspace/jobs/list

Load the output to a delta table. 

Use Databricks dashboards to display this data, and schedule the job that loads the Databricks job details as required.
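
The API-based approach suggested above could be sketched roughly as follows for the first requirement (failed jobs and their names in the last 24 hours). This is a minimal sketch, not a definitive implementation. It assumes the Jobs API 2.1 `runs/list` endpoint (the endpoint for job runs, rather than the `jobs/list` endpoint linked above) and its `start_time_from`, `completed_only`, `limit` and `page_token` parameters; please check these against the API reference for your workspace. The helper names `fetch_runs` and `failed_runs`, the `host` and `token` arguments, and the set of result states treated as failures are all hypothetical choices made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: these result states count as "failed" for monitoring purposes.
# Adjust the set to match what your team considers a failure.
FAILED_STATES = {"FAILED", "TIMEDOUT"}


def fetch_runs(host, token, start_time_from_ms):
    """Fetch completed job runs started at or after start_time_from_ms.

    Hypothetical helper: `host` is the workspace URL and `token` is a
    personal access token. Follows pagination until no pages remain.
    """
    runs, page_token = [], None
    while True:
        params = {
            "completed_only": "true",
            "start_time_from": str(start_time_from_ms),
            "limit": "25",
        }
        if page_token:
            params["page_token"] = page_token
        url = f"{host}/api/2.1/jobs/runs/list?{urllib.parse.urlencode(params)}"
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            body = json.load(response)
        runs.extend(body.get("runs", []))
        page_token = body.get("next_page_token")
        if not body.get("has_more") or not page_token:
            return runs


def failed_runs(runs, now_ms, window_hours=24):
    """Keep only runs that started inside the window and ended in a failed state."""
    cutoff_ms = now_ms - window_hours * 3600 * 1000
    rows = []
    for run in runs:
        result_state = run.get("state", {}).get("result_state")
        if run.get("start_time", 0) >= cutoff_ms and result_state in FAILED_STATES:
            rows.append(
                {
                    "job_id": run.get("job_id"),
                    "run_id": run.get("run_id"),
                    "run_name": run.get("run_name"),
                    "result_state": result_state,
                    "start_time": run.get("start_time"),
                }
            )
    return rows
```

In a scheduled notebook, the list returned by `failed_runs` could then be converted to a DataFrame and appended to a Delta table, which a dashboard reads from.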
