<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Forward Spark structured streaming metrics to Datadog in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/forward-spark-structured-streaming-metrics-to-datadog/m-p/30782#M22349</link>
    <description>&lt;P&gt;We have a Spark Structured Streaming application written in PySpark that we'd like to monitor with Datadog. By default, Datadog collects a couple of streaming metrics, such as '&lt;I&gt;spark.structured_streaming.processing_rate&lt;/I&gt;' and '&lt;I&gt;spark.structured_streaming.latency&lt;/I&gt;'. However, after setting '&lt;I&gt;logs_enabled: true&lt;/I&gt;' and '&lt;I&gt;spark.sql.streaming.metricsEnabled = true&lt;/I&gt;' in the cluster init script, we're still unable to see any streaming metrics in Datadog. After some research, it seems we need to implement a subclass of '&lt;I&gt;StreamingQueryListener&lt;/I&gt;' from Spark Structured Streaming to make this work. Is this assumption correct? If so, is it possible to implement it in Python instead of Scala? I haven't seen a Python implementation anywhere. If it's possible, I would appreciate it if someone could point me to an example. Any help would be appreciated!&lt;/P&gt;</description>
    <pubDate>Mon, 26 Sep 2022 22:34:44 GMT</pubDate>
    <dc:creator>Lizzz</dc:creator>
    <dc:date>2022-09-26T22:34:44Z</dc:date>
    <item>
      <title>Forward Spark structured streaming metrics to Datadog</title>
      <link>https://community.databricks.com/t5/data-engineering/forward-spark-structured-streaming-metrics-to-datadog/m-p/30782#M22349</link>
      <description>&lt;P&gt;We have a Spark Structured Streaming application written in PySpark that we'd like to monitor with Datadog. By default, Datadog collects a couple of streaming metrics, such as '&lt;I&gt;spark.structured_streaming.processing_rate&lt;/I&gt;' and '&lt;I&gt;spark.structured_streaming.latency&lt;/I&gt;'. However, after setting '&lt;I&gt;logs_enabled: true&lt;/I&gt;' and '&lt;I&gt;spark.sql.streaming.metricsEnabled = true&lt;/I&gt;' in the cluster init script, we're still unable to see any streaming metrics in Datadog. After some research, it seems we need to implement a subclass of '&lt;I&gt;StreamingQueryListener&lt;/I&gt;' from Spark Structured Streaming to make this work. Is this assumption correct? If so, is it possible to implement it in Python instead of Scala? I haven't seen a Python implementation anywhere. If it's possible, I would appreciate it if someone could point me to an example. Any help would be appreciated!&lt;/P&gt;</description>
      <pubDate>Mon, 26 Sep 2022 22:34:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/forward-spark-structured-streaming-metrics-to-datadog/m-p/30782#M22349</guid>
      <dc:creator>Lizzz</dc:creator>
      <dc:date>2022-09-26T22:34:44Z</dc:date>
    </item>
    <item>
      <title>Re: Forward Spark structured streaming metrics to Datadog</title>
      <link>https://community.databricks.com/t5/data-engineering/forward-spark-structured-streaming-metrics-to-datadog/m-p/30783#M22350</link>
      <description>&lt;P&gt;@Liz Zhang, please refer to the blog post below, which contains a PySpark implementation of StreamingQueryListener:&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.databricks.com/blog/2022/05/27/how-to-monitor-streaming-queries-in-pyspark.html" target="test_blank"&gt;https://www.databricks.com/blog/2022/05/27/how-to-monitor-streaming-queries-in-pyspark.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 27 Sep 2022 15:48:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/forward-spark-structured-streaming-metrics-to-datadog/m-p/30783#M22350</guid>
      <dc:creator>shan_chandra</dc:creator>
      <dc:date>2022-09-27T15:48:52Z</dc:date>
    </item>
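<!--
To make the answer above concrete, here is a minimal sketch of a Python StreamingQueryListener that forwards a few progress fields to Datadog. It is not taken from the linked blog post. It assumes a PySpark version that ships pyspark.sql.streaming.StreamingQueryListener and a Datadog Agent with DogStatsD listening on the default UDP port 8125 on the driver node. The metric prefix "custom.streaming" and the names progress_to_datagrams, send_datagrams and DatadogListener are hypothetical, chosen only for illustration.

```python
import json
import socket


def progress_to_datagrams(progress_json, prefix="custom.streaming"):
    """Turn one StreamingQueryProgress JSON string into DogStatsD gauge lines.

    The prefix is a hypothetical metric namespace, not a Datadog default.
    """
    progress = json.loads(progress_json)
    # Tag each metric with the query name, falling back to the query id.
    tag = "#query:" + str(progress.get("name") or progress.get("id"))
    fields = {
        "input_rows": progress.get("numInputRows"),
        "input_rate": progress.get("inputRowsPerSecond"),
        "processing_rate": progress.get("processedRowsPerSecond"),
        "trigger_ms": (progress.get("durationMs") or {}).get("triggerExecution"),
    }
    # DogStatsD datagram format: name:value|type|#tags ("g" means gauge).
    return [f"{prefix}.{k}:{v}|g|{tag}" for k, v in fields.items() if v is not None]


def send_datagrams(lines, host="127.0.0.1", port=8125):
    """Send each line as one UDP datagram (assumes a local DogStatsD server)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for line in lines:
            sock.sendto(line.encode("utf-8"), (host, port))
    finally:
        sock.close()


try:
    from pyspark.sql.streaming import StreamingQueryListener
except ImportError:
    # Fallback so the helpers above can be used without a suitable PySpark.
    StreamingQueryListener = object


class DatadogListener(StreamingQueryListener):
    """Hypothetical listener that emits gauges on every progress event."""

    def onQueryStarted(self, event):
        pass

    def onQueryProgress(self, event):
        # event.progress.json is the compact JSON form of the progress report.
        send_datagrams(progress_to_datagrams(event.progress.json))

    def onQueryIdle(self, event):
        pass

    def onQueryTerminated(self, event):
        pass
```

Under these assumptions, the listener would be registered once on the driver with spark.streams.addListener(DatadogListener()) before the streaming query is started. Sending plain UDP datagrams keeps the sketch free of extra dependencies; the official datadog Python package could be swapped in for send_datagrams.
-->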
  </channel>
</rss>

