<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Can Spark History server be created in Databricks? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29211#M20957</link>
    <description>&lt;P&gt;We have a Spark pipeline producing more than 3k Spark jobs. After the pipeline finishes and the cluster shuts down, only a subset (&amp;lt;1k) of these can be recovered from the Spark UI.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We would like to have access to the full Spark UI after the pipeline has terminated and the cluster has shut down. This is for performance monitoring purposes. Is it possible to deploy a Spark History Server in Databricks? If not, what is your recommended approach?&lt;/P&gt;</description>
    <pubDate>Wed, 05 Oct 2022 06:46:57 GMT</pubDate>
    <dc:creator>vladcrisan</dc:creator>
    <dc:date>2022-10-05T06:46:57Z</dc:date>
    <item>
      <title>Can Spark History server be created in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29211#M20957</link>
      <description>&lt;P&gt;We have a Spark pipeline producing more than 3k Spark jobs. After the pipeline finishes and the cluster shuts down, only a subset (&amp;lt;1k) of these can be recovered from the Spark UI.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;We would like to have access to the full Spark UI after the pipeline has terminated and the cluster has shut down. This is for performance monitoring purposes. Is it possible to deploy a Spark History Server in Databricks? If not, what is your recommended approach?&lt;/P&gt;</description>
      <pubDate>Wed, 05 Oct 2022 06:46:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29211#M20957</guid>
      <dc:creator>vladcrisan</dc:creator>
      <dc:date>2022-10-05T06:46:57Z</dc:date>
    </item>
    <item>
      <title>Re: Can Spark History server be created in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29213#M20959</link>
      <description>&lt;P&gt;It depends on what data you need. Integrating with Datadog can work well: &lt;A href="https://www.datadoghq.com/blog/databricks-monitoring-datadog/" target="test_blank"&gt;https://www.datadoghq.com/blog/databricks-monitoring-datadog/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;You can also redirect logs to Azure analytics.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Oct 2022 15:36:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29213#M20959</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-10-13T15:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: Can Spark History server be created in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29214#M20960</link>
      <description>&lt;P&gt;Hi @Debayan Mukherjee, my question is about the total number of Spark jobs on one cluster and how these can be retrieved in the Spark UI after the cluster shuts down, rather than the number of concurrent jobs. A concrete example: if a notebook running on a cluster produces a total of 3k jobs (e.g. sequentially), then after the underlying cluster shuts down I can see only approximately 1k jobs in the Spark UI. My question is whether there is a way to recover all jobs in the Spark UI.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Oct 2022 13:56:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29214#M20960</guid>
      <dc:creator>vladcrisan</dc:creator>
      <dc:date>2022-10-17T13:56:32Z</dc:date>
    </item>
    <item>
      <title>Re: Can Spark History server be created in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29215#M20961</link>
      <description>&lt;P&gt;I would like to recover all information displayed in the Spark UI. Datadog is a good suggestion, but unfortunately we can't use external services in our application. Azure analytics would be an option, but I couldn't find any reference showing how to recover the full Spark UI through it.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Oct 2022 13:59:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29215#M20961</guid>
      <dc:creator>vladcrisan</dc:creator>
      <dc:date>2022-10-17T13:59:32Z</dc:date>
    </item>
    <item>
      <title>Re: Can Spark History server be created in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29216#M20962</link>
      <description>&lt;P&gt;@Vlad Crisan, you can replay the Spark events on a Databricks cluster. Please follow this KB article: &lt;A href="https://kb.databricks.com/clusters/replay-cluster-spark-events" target="test_blank"&gt;https://kb.databricks.com/clusters/replay-cluster-spark-events&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Note: Please spin up a cluster running version 10.4 LTS.&lt;/P&gt;</description>
      <pubDate>Fri, 02 Jun 2023 10:29:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29216#M20962</guid>
      <dc:creator>Sandeep</dc:creator>
      <dc:date>2023-06-02T10:29:54Z</dc:date>
    </item>
    <item>
      <title>Re: Can Spark History server be created in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29212#M20958</link>
      <description>&lt;P&gt;Hi @Vlad Crisan, as of now, a workspace is limited to 1000 concurrent job runs. A 429 Too Many Requests response is returned when you request a run that cannot start immediately. The number of jobs a workspace can create in an hour is limited to 10000 (includes “runs submit”).&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/workflows/jobs/jobs.html#create-run-and-manage-databricks-jobs" alt="https://docs.databricks.com/workflows/jobs/jobs.html#create-run-and-manage-databricks-jobs" target="_blank"&gt;https://docs.databricks.com/workflows/jobs/jobs.html#create-run-and-manage-databricks-jobs&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;To increase the jobs limit in a workspace, see the following:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/administration-guide/workspace/enable-increased-jobs-limit.html" alt="https://docs.databricks.com/administration-guide/workspace/enable-increased-jobs-limit.html" target="_blank"&gt;https://docs.databricks.com/administration-guide/workspace/enable-increased-jobs-limit.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Oct 2022 07:08:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-spark-history-server-be-created-in-databricks/m-p/29212#M20958</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2022-10-06T07:08:28Z</dc:date>
    </item>
  </channel>
</rss>

