<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Differences between Spark Cluster Manager and Databricks Cluster Manager? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29990#M21677</link>
    <description>&lt;P&gt;Hi @John William,&lt;/P&gt;&lt;P&gt;Databricks clusters use Spark's&amp;nbsp;&lt;A href="https://spark.apache.org/docs/latest/spark-standalone.html" alt="https://spark.apache.org/docs/latest/spark-standalone.html" target="_blank"&gt;Standalone cluster manager&lt;/A&gt;. Each Databricks cluster has its own standalone Master and Worker processes, which run inside LXC containers and share a lifecycle with the cluster. Each cluster has a single Driver process, which acts as the sole Spark application for the standalone cluster.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here is the official Spark Standalone cluster mode doc: &lt;A href="https://spark.apache.org/docs/latest/spark-standalone.html" target="test_blank"&gt;https://spark.apache.org/docs/latest/spark-standalone.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 30 Sep 2022 12:58:22 GMT</pubDate>
    <dc:creator>User16752242622</dc:creator>
    <dc:date>2022-09-30T12:58:22Z</dc:date>
    <item>
      <title>Differences between Spark Cluster Manager and Databricks Cluster Manager?</title>
      <link>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29989#M21676</link>
      <description>&lt;P&gt;I couldn't find any documentation on the Databricks Cluster Manager. Could anyone point me to some resources on this topic?&lt;/P&gt;</description>
      <pubDate>Fri, 30 Sep 2022 07:56:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29989#M21676</guid>
      <dc:creator>jwilliam</dc:creator>
      <dc:date>2022-09-30T07:56:28Z</dc:date>
    </item>
    <item>
      <title>Re: Differences between Spark Cluster Manager and Databricks Cluster Manager?</title>
      <link>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29990#M21677</link>
      <description>&lt;P&gt;Hi @John William,&lt;/P&gt;&lt;P&gt;Databricks clusters use Spark's&amp;nbsp;&lt;A href="https://spark.apache.org/docs/latest/spark-standalone.html" alt="https://spark.apache.org/docs/latest/spark-standalone.html" target="_blank"&gt;Standalone cluster manager&lt;/A&gt;. Each Databricks cluster has its own standalone Master and Worker processes, which run inside LXC containers and share a lifecycle with the cluster. Each cluster has a single Driver process, which acts as the sole Spark application for the standalone cluster.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Here is the official Spark Standalone cluster mode doc: &lt;A href="https://spark.apache.org/docs/latest/spark-standalone.html" target="test_blank"&gt;https://spark.apache.org/docs/latest/spark-standalone.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 30 Sep 2022 12:58:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29990#M21677</guid>
      <dc:creator>User16752242622</dc:creator>
      <dc:date>2022-09-30T12:58:22Z</dc:date>
    </item>
    <item>
      <title>Re: Differences between Spark Cluster Manager and Databricks Cluster Manager?</title>
      <link>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29991#M21678</link>
      <description>&lt;P&gt;Hi @Akash Bhat, thank you for your reply. I'm really surprised that Databricks clusters use Spark's Standalone cluster manager because, if I read this post correctly, Databricks uses Kubernetes as its cluster manager: &lt;A href="https://www.databricks.com/blog/2021/08/06/how-we-built-databricks-on-google-kubernetes-engine-gke.html" alt="https://www.databricks.com/blog/2021/08/06/how-we-built-databricks-on-google-kubernetes-engine-gke.html" target="_blank"&gt;https://www.databricks.com/blog/2021/08/06/how-we-built-databricks-on-google-kubernetes-engine-gke.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 05 Oct 2022 07:35:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29991#M21678</guid>
      <dc:creator>jwilliam</dc:creator>
      <dc:date>2022-10-05T07:35:23Z</dc:date>
    </item>
    <item>
      <title>Re: Differences between Spark Cluster Manager and Databricks Cluster Manager?</title>
      <link>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29992#M21679</link>
      <description>&lt;P&gt;Hi @John William,&lt;/P&gt;&lt;P&gt;The cluster manager launches worker instances and starts worker services.&lt;/P&gt;&lt;P&gt;To obtain these instances for a cluster, the cluster manager issues API calls to the cloud provider (AWS or Azure).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Databricks on GCP, by contrast, maintains Google Kubernetes Engine (GKE) node pools for provisioning the driver node and the executor nodes.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Oct 2022 18:32:15 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/differences-between-spark-cluster-manager-and-databricks-cluster/m-p/29992#M21679</guid>
      <dc:creator>User16752242622</dc:creator>
      <dc:date>2022-10-06T18:32:15Z</dc:date>
    </item>
  </channel>
</rss>

