Integration options for Databricks Jobs and DataDog?

JordanYaker
Contributor

I know that there is already the Databricks (technically Spark) integration for DataDog. Unfortunately, that integration only covers the cluster execution itself and that means only Cluster Metrics and Spark Jobs and Tasks. I'm looking for something that will allow me to track metrics about the Databricks Jobs (e.g., Successful Job Runs, Failed Tasks, etc.).

Currently, it seems like my only option would be to use a combination of webhook notifications for the job and custom code inside the task. Unfortunately, this has the following drawbacks:

  1. We use PySpark, and if a job dies with a Kernel Unresponsive error, custom code inside the task will never get the chance to report its metrics.
  2. The webhook notifications don't produce events that are ready to use as custom metrics within DataDog, so I would have to build some sort of custom handler to map and enrich the webhook events.
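For the second drawback, the mapping layer can stay fairly small. Here's a minimal sketch of what such a handler's core could look like: a pure function that turns one webhook payload into a Datadog v1 custom-metric "series" entry. The incoming field names (`event_type`, `job.job_id`) and the metric names are assumptions for illustration, not a documented Databricks webhook contract; verify them against your actual payloads.

```python
import time

# Assumed mapping from Databricks webhook event types to custom metric
# names -- both sides here are hypothetical and should be adjusted to
# match the payloads your jobs actually emit.
EVENT_METRICS = {
    "jobs.on_success": "databricks.jobs.run.success",
    "jobs.on_failure": "databricks.jobs.run.failure",
}

def event_to_series(payload, now=None):
    """Map one webhook payload (a parsed-JSON dict) to a Datadog v1
    'series' entry, or return None for events we don't track."""
    metric = EVENT_METRICS.get(payload.get("event_type"))
    if metric is None:
        return None  # e.g. an on_start event we choose to ignore
    ts = int(now if now is not None else time.time())
    job_id = payload.get("job", {}).get("job_id")  # assumed payload shape
    return {
        "metric": metric,
        "type": "count",
        "points": [[ts, 1]],          # one occurrence at this timestamp
        "tags": [f"job_id:{job_id}"],  # lets you slice by job in DataDog
    }
```

The resulting dict can then be wrapped in `{"series": [...]}` and POSTed to Datadog's `api/v1/series` endpoint with a `DD-API-KEY` header, from whatever small service or serverless function receives the webhooks.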

I doubt this is a unique challenge, and I'm surely not the only one looking for something like this. So before I go down the rabbit hole of building something home-brewed, I wanted to check with the community/Databricks support to see if I'm missing something.

0 REPLIES