Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Observability and monitoring across multiple workspaces (both job clusters and serverless compute)

sparkycloud
Visitor

Hi all, 

What are the best options available today for observability and monitoring of Databricks jobs across all workspaces? We have hundreds of workspaces, and it is hard to monitor which jobs failed and which succeeded.

We tried using: 

1. A Teams webhook to notify ourselves of errors, but it is not very scalable.

2. Grafana and Datadog, but they rely on init scripts, which are no longer an option on serverless compute.

3. System tables (compute and job timeline), but they lack resource usage metrics.

4. The Databricks Workflows UI, but it is limited to one workspace, so it does not scale.
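One interim pattern that works without init scripts is polling each workspace's Jobs REST API and aggregating the terminal states centrally. A minimal sketch, assuming the `WORKSPACES` list (hosts and tokens) is supplied by you and that the standard `/api/2.1/jobs/runs/list` endpoint is reachable; the aggregation logic is kept separate from the HTTP call so it can be reused against any run payload:

```python
import json
import urllib.parse
import urllib.request
from collections import Counter

# Placeholder: fill in your own workspace hosts and PATs / SP tokens.
WORKSPACES = [
    # {"host": "https://adb-0000000000000000.0.azuredatabricks.net", "token": "..."},
]

def fetch_completed_runs(host: str, token: str, limit: int = 25) -> list[dict]:
    """Fetch recently completed job runs from one workspace via
    GET /api/2.1/jobs/runs/list (completed runs only)."""
    params = urllib.parse.urlencode({"completed_only": "true", "limit": limit})
    req = urllib.request.Request(
        f"{host}/api/2.1/jobs/runs/list?{params}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp).get("runs", [])

def summarize_runs(runs: list[dict]) -> Counter:
    """Count runs by terminal result_state (SUCCESS, FAILED, ...)."""
    return Counter(
        run.get("state", {}).get("result_state", "UNKNOWN") for run in runs
    )

if __name__ == "__main__":
    # One summary line per workspace, e.g. {'SUCCESS': 20, 'FAILED': 2}.
    for ws in WORKSPACES:
        summary = summarize_runs(fetch_completed_runs(ws["host"], ws["token"]))
        print(ws["host"], dict(summary))
```

This scales linearly with the number of workspaces, so for hundreds of them you would likely schedule it from a single "monitoring" workspace and write the summaries into a Delta table.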

What we want to have: 

1. An overview of failed and successful jobs across all workspaces.

2. Failure alerts with easy navigation to application logs.

3. Email alerts (nice to have).

4. Support for serverless compute.

Thanks in advance! 

Best Regards,

sunny

0 REPLIES
