Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Integration options for Databricks Jobs and DataDog?

JordanYaker
Contributor

I know there is already a Databricks (technically Spark) integration for DataDog. Unfortunately, that integration only covers cluster execution itself, which means only Cluster Metrics and Spark Jobs and Tasks. I'm looking for something that will let me track metrics about the Databricks Jobs themselves (e.g., successful job runs, failed tasks, etc.).

Currently, it seems like my only option would be a combination of webhook notifications for the job and custom code inside the task. Unfortunately, this approach has the following drawbacks:

  1. We use PySpark, and if I hit a Kernel Unresponsive error, the custom code inside the task won't get a chance to report metrics at all.
  2. The webhook notifications don't produce events that are ready to use as custom metrics in DataDog, so I would have to build some sort of custom handler to map/enrich the events coming out of the webhook.
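To make drawback 2 concrete, here is a minimal sketch of what that custom handler might do: translate a Databricks job-run webhook payload into a Datadog-style custom metric point. The field names (`run.state.result_state`, `job.job_id`) and the metric naming are my assumptions for illustration, not the exact webhook schema or an official mapping.

```python
# Hypothetical handler sketch: map a Databricks job webhook event to a
# Datadog-style custom metric dict. Payload field names are assumptions.

def webhook_to_metric(payload: dict) -> dict:
    """Turn one job-run event into a metric point a submitter could send."""
    status = payload.get("run", {}).get("state", {}).get("result_state", "UNKNOWN")
    metric = (
        "databricks.job.run.success"
        if status == "SUCCESS"
        else "databricks.job.run.failure"
    )
    return {
        "metric": metric,
        "points": [1],  # count one run occurrence
        "tags": [
            f"job_id:{payload.get('job', {}).get('job_id', 'unknown')}",
            f"result_state:{status}",
        ],
    }

event = {"job": {"job_id": 123}, "run": {"state": {"result_state": "SUCCESS"}}}
print(webhook_to_metric(event)["metric"])  # databricks.job.run.success
```

Something like this would still need a small web service to receive the webhook and a submission step (e.g., via a Datadog agent or API client), which is exactly the home-brew part I'd rather avoid.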

I feel like this isn't a unique challenge and that I'm not the only one looking for something like this, so before I go down the rabbit hole of building something home-brew, I thought I would check with the community/Databricks support to see if I'm missing something.
