Hi guys, I would like to monitor a streaming job for metrics such as delay, processing time, and more. I found this documentation, but I only get messages at the starting and terminating phases, not while records are being processed. The job is a simple streaming query that reads CSV files and writes the output to the console.
The listener class is the following:
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

val listener = new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = {
    println("Starting job!")
  }
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    println(s"TestStreaming: ${event.progress.durationMs}")
  }
  // onQueryIdle takes a QueryIdleEvent (available since Spark 3.5)
  override def onQueryIdle(event: QueryIdleEvent): Unit = {}
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = {
    println("Job terminated!")
  }
}
// The listener only fires once registered on the session
spark.streams.addListener(listener)
Secondly, I would like to customise Log4j via a properties file, but loading the file into Spark's config folder /databricks/spark/conf through an init script (init-script.sh) doesn't work.
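For context, one thing worth checking: newer Databricks runtimes (11.0 and later) ship Spark with Log4j 2, so a Log4j 1.x-style log4j.properties file dropped into /databricks/spark/conf is silently ignored; the file would need to be named log4j2.properties and use the Log4j 2 properties syntax. Below is a minimal sketch of that format (the pattern and levels are illustrative assumptions, not values from my setup):

```
# log4j2.properties — Log4j 2 syntax (a sketch, assuming a DBR 11+ cluster)
rootLogger.level = INFO
rootLogger.appenderRef.stdout.ref = console

appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = %d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
```

If the cluster is on an older runtime with Log4j 1.x, the original log4j.properties format would apply instead.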
Thank you for your support.