REST API for Stream Monitoring

Baldrez
New Contributor II

Hi, everyone. I just recently started using Databricks on Azure so my question is probably very basic but I am really stuck right now.

I need to capture some streaming metrics (the number of input rows and their time), so I tried using the Spark REST API. However, I get the following error: "no streaming listener attached to Databricks Shell". I tried different solutions I have seen in videos and tutorials, but none have worked so far. This only happens when I try to get the stream statistics; if I use the API for jobs or stages, I get the JSON as expected.

Here is the code I am trying to run:

import requests
import json

# Driver host and UI port of the running Spark application
driverIp = spark.conf.get("spark.driver.host")
port = spark.conf.get("spark.ui.port")

# Look up the application ID through the Spark REST API
temp_url = f"http://{driverIp}:{port}/api/v1/applications"
temp_r = requests.get(temp_url, timeout=10.0)
content_r = json.loads(temp_r.content)
app_id = content_r[0]["id"]

# Request the streaming statistics for that application
url = f"http://{driverIp}:{port}/api/v1/applications/{app_id}/streaming/statistics"
r = requests.get(url, timeout=10.0)
print(r.content)

I understand that I should attach the streaming listener in order to get the metrics I need, but I still do not understand how to implement it in the code. Could someone please help me with this issue?

Thanks a lot in advance

1 ACCEPTED SOLUTION

User16763506477
Contributor III

Hi @Roberto Baldrez, you will need to add the following configs to the cluster:

spark.sql.streaming.metricsEnabled true
*.sink.servlet.class org.apache.spark.metrics.sink.MetricsServlet
*.sink.servlet.path /metrics/json
master.sink.servlet.path /metrics/master/json
applications.sink.servlet.path /metrics/applications/json

The URL will change to "http://<driverIP>:<port>/metrics/json/". The one you mentioned is for DStream applications.
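
For illustration, here is a minimal sketch of reading that endpoint from a notebook. It rests on a few assumptions that are not stated in this thread: that the servlet returns Dropwizard-style JSON with a top-level "gauges" object, and that the Structured Streaming gauge names contain ".spark.streaming.". The helper names fetch_metrics and streaming_gauges are hypothetical, and the sketch uses the standard library's urllib so it has no extra dependencies (requests would work just as well).

```python
import json
from urllib.request import urlopen


def fetch_metrics(driver_ip, port, timeout=10.0):
    """Download the JSON document served by the metrics servlet."""
    url = f"http://{driver_ip}:{port}/metrics/json/"
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def streaming_gauges(payload, marker=".spark.streaming."):
    """Return {gauge name: value} for gauges whose name contains `marker`.

    Assumes a Dropwizard-style layout: {"gauges": {name: {"value": ...}}}.
    """
    gauges = payload.get("gauges", {})
    return {
        name: body.get("value")
        for name, body in gauges.items()
        if marker in name
    }


# In a Databricks notebook, where `spark` is predefined:
# payload = fetch_metrics(spark.conf.get("spark.driver.host"),
#                         spark.conf.get("spark.ui.port"))
# for name, value in sorted(streaming_gauges(payload).items()):
#     print(name, value)
```

If the filtered result comes back empty, it is worth printing the raw gauge names first, since the exact naming depends on the Spark version and on whether the query was given a name.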

Note: This exposes only a limited set of streaming metrics. If you need all metrics, you will need to add a metrics sink to the cluster.
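
If only the input row count and its timestamp are needed, as in the question, a different route that may be enough is to read the progress reports of the streaming query itself instead of the metrics endpoint. The sketch below is an alternative technique, not part of the config approach above. It assumes the PySpark behaviour where recentProgress returns each report as a dictionary with "timestamp" and "numInputRows" keys; input_rows_by_time and query are hypothetical names.

```python
def input_rows_by_time(progress_reports):
    """Return (timestamp, numInputRows) pairs from progress dictionaries."""
    return [
        (p["timestamp"], p["numInputRows"])
        for p in progress_reports
        if p is not None and "numInputRows" in p
    ]


# In a notebook, with `query` being the running StreamingQuery
# returned by writeStream.start() (hypothetical variable name):
# for ts, rows in input_rows_by_time(query.recentProgress):
#     print(ts, rows)
```

This only covers the most recent batches that the query keeps in memory, so it suits ad hoc checks better than long-term monitoring.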

4 REPLIES

Anonymous
Not applicable

Hi @Roberto Baldrez​ - My name is Piper and I'm one of the community moderators. Thanks for your question. Let's give it a bit to see what the community says. Thank you for your patience.

Could you please tell us where these cluster configs should be set? I cannot find the place for them. Thanks.

jose_gonzalez
Moderator

Hi @Roberto Baldrez,

If you think that @Gaurav Rupnar solved your question, please select it as the best response so it can be moved to the top of the topic, where it will help more users in the future.

Thank you
