Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

How to get Spark run-time and structured metrics before job completion?

saicharandeepb
New Contributor II

Hi all,

I’m trying to get Spark run-time metrics and structured streaming metrics by enabling cluster logging, and I now see the following folders:

[screenshot: saicharandeepb_0-1757940030183.png — cluster log delivery folders]

What I noticed is that the eventlog folder only gets populated after a job has completed. That makes it difficult to calculate metrics in near real-time.

Is there a common parser or recommended approach to read from the driver and executor logs so that I can compute these metrics while the job is still running, rather than only after completion?

Thanks in advance for your guidance!

2 REPLIES

Isi
Honored Contributor II

Hello @saicharandeepb 

I would recommend using the Gist by rayalex.

It integrates EC2 Alloy with Prometheus and Grafana, allowing you to capture and visualize Spark run-time and structured streaming metrics in near real-time.

It’s not a solution natively integrated in Databricks (since, as far as I know, runtime-level access is restricted), but I think it’s a very solid approach if your goal is to collect this information and display it in a dashboard.
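For reference, recent Spark versions (3.x) also ship a built-in PrometheusServlet sink, so a Prometheus/Grafana stack can scrape the driver while a job is still running. A rough, illustrative sketch of the cluster Spark config (values are only an example, not taken from the gist above):

# illustrative settings for Spark's built-in Prometheus endpoints
spark.ui.prometheus.enabled true
spark.metrics.conf.*.sink.prometheusServlet.class org.apache.spark.metrics.sink.PrometheusServlet
spark.metrics.conf.*.sink.prometheusServlet.path /metrics/prometheus

Driver metrics should then be available on the driver UI under that path, and executor metrics under /metrics/executors/prometheus; whether those endpoints are reachable from your scraper depends on your networking setup.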

Hope this helps 🙂

Isi

ManojkMohan
Contributor III

I would recommend the following approaches (a SparkListener sketch follows the table):

| Method | Real-Time? | Complexity | Typical Use Case |
| --- | --- | --- | --- |
| SparkListener / QueryListener | Yes | Moderate | Job/stage/batch metrics live |
| Custom Metrics Source | Yes (live) | More advanced | Fine-grained, app-specific metrics |
| Metrics Sinks | Yes | Easy/Moderate | External dashboard/monitoring |
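For the first row, here is a minimal SparkListener sketch (the class name TaskMetricsLogger and the choice of logged fields are illustrative, not from this thread) that reports task-level run-time metrics while the job is still executing:

import org.apache.spark.scheduler.{SparkListener, SparkListenerStageCompleted, SparkListenerTaskEnd}

// Illustrative listener: prints task metrics as soon as each task finishes.
class TaskMetricsLogger extends SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val m = taskEnd.taskMetrics
    if (m != null) {
      println(s"stage=${taskEnd.stageId} task=${taskEnd.taskInfo.taskId} " +
        s"runTimeMs=${m.executorRunTime} recordsRead=${m.inputMetrics.recordsRead} " +
        s"shuffleWriteBytes=${m.shuffleWriteMetrics.bytesWritten}")
    }
  }

  override def onStageCompleted(stage: SparkListenerStageCompleted): Unit = {
    println(s"stage=${stage.stageInfo.stageId} completed, tasks=${stage.stageInfo.numTasks}")
  }
}

// Attach it to the running SparkContext (e.g. from a notebook cell where `spark` is defined):
spark.sparkContext.addSparkListener(new TaskMetricsLogger)

Instead of println you could of course write these values to a Delta table or push them to your monitoring system.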

Example of a custom metrics source (which an external sink such as Prometheus can then pick up):

package org.apache.spark.metrics.source

import com.codahale.metrics.{MetricRegistry, SettableGauge}
import org.apache.spark.SparkEnv
import org.apache.spark.sql.streaming.StreamingQueryListener

// Custom source registered with Spark's metrics system (declared in this package
// so it can access the private[spark] metrics API).
object MyCustomSource extends Source {
  override def sourceName: String = "MyCustomSource"
  override val metricRegistry: MetricRegistry = new MetricRegistry
  // Gauge that the listener below updates on every micro-batch.
  val MY_METRIC_A: SettableGauge[Long] = metricRegistry.gauge(MetricRegistry.name("a"))

  class MyListener extends StreamingQueryListener {
    override def onQueryStarted(event: StreamingQueryListener.QueryStartedEvent): Unit = {}
    override def onQueryTerminated(event: StreamingQueryListener.QueryTerminatedEvent): Unit = {}
    override def onQueryProgress(event: StreamingQueryListener.QueryProgressEvent): Unit = {
      // Publish the id of the most recent micro-batch while the query is running.
      MyCustomSource.MY_METRIC_A.setValue(event.progress.batchId)
    }
  }

  def apply(): MyListener = {
    // Register the source so the configured sinks can report it.
    SparkEnv.get.metricsSystem.registerSource(MyCustomSource)
    new MyListener()
  }
}

// Register in your Spark app:
spark.streams.addListener(MyCustomSource())

This exposes a custom metric (here, the batch ID) through Spark’s metrics system, from where it can be picked up by Prometheus, Grafana, or other monitoring tools.
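Note that the gauge only reaches an external system once a metrics sink is configured for the cluster (for example via spark.metrics.conf.* settings); with the default naming it should show up roughly as <metrics-namespace>.driver.MyCustomSource.a, where the namespace defaults to the application ID.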
