Issue: Spark Structured Streaming application
After adding the listener jar file through the cluster init script, the listener is working (judging by the stdout/log4j logs).
But when I send a request with the header
'Content-Type: application/json' to the http://host:port/api/v1/applications/app-id/streaming/statistics endpoint,
the response says that no streaming listener is attached to the Spark application.
Details:
We created a jar file containing the class below and made it available to the cluster with the following init script.
Shell script:
#!/bin/bash
cp /dbfs/FileStore/jars/my_jar.jar /databricks/jars
mySparkListener class:
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._
import org.apache.spark.sql.streaming.StreamingQueryProgress
import org.apache.log4j.Logger
import org.joda.time.DateTime
import scala.collection.JavaConverters._

// Prints every Structured Streaming query event to stdout.
class mySparkListener extends StreamingQueryListener {
  override def onQueryStarted(queryStarted: QueryStartedEvent): Unit = {
    println("Query started: " + queryStarted.id)
  }

  override def onQueryTerminated(queryTerminated: QueryTerminatedEvent): Unit = {
    println("Query terminated: " + queryTerminated.id)
  }

  override def onQueryProgress(queryProgress: QueryProgressEvent): Unit = {
    println("Query made progress: " + queryProgress.progress)
  }
}

// Register the listener on the active SparkSession.
val listener = new mySparkListener()
spark.streams.addListener(listener)
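For comparison, the progress information that the listener prints can also be read directly from the active queries, without going through the REST endpoint. This is only a minimal sketch, not part of the jar above; it assumes a SparkSession is already running (as in a notebook) and that at least one streaming query has been started:

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a SparkSession already exists; getOrCreate() returns it.
val spark = SparkSession.builder().getOrCreate()

// Print the most recent progress report of every active streaming query.
spark.streams.active.foreach { query =>
  // lastProgress is null until the first micro-batch has completed.
  Option(query.lastProgress).foreach { progress =>
    println(s"${query.id}: ${progress.json}")
  }
}
```

If this prints progress JSON while the REST call still reports no listener, the query and the listener themselves are working and the problem is specific to that endpoint.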
Note: I have already added the following configs to the cluster:
- spark.sql.streaming.metricsEnabled true
- *.sink.servlet.class org.apache.spark.metrics.sink.MetricsServlet
- *.sink.servlet.path /metrics/json
- master.sink.servlet.path /metrics/master/json
- applications.sink.servlet.path /metrics/applications/json