Hi @DBEnthusiast, In a Spark cluster, the SparkContext object in your main program (the driver program) connects to a cluster manager, which could be Spark's standalone cluster manager, Mesos, YARN, or Kubernetes. This cluster manager allocates resources across applications.
Once connected, Spark acquires executors on nodes in the cluster: processes that run computations and store data for your application. The SparkContext then sends your application code to the executors, and finally sends tasks for the executors to run. In Databricks, a similar process occurs.
The difference is that Databricks lets you create interactive (all-purpose) clusters in advance, while job clusters are created on the fly when a job starts.
Now, to answer your questions:
1. Yes, the memory available after installing the Docker image will be less than the node's initial memory, since the image's OS and libraries (plus the services Databricks runs on the node) consume part of it. The exact amount depends on the specific Docker image and your cluster configuration, so it's hard to give a precise figure without those details.
2. In Databricks, the resource manager is generally Spark's standalone resource manager. In a general Spark setup, however, Spark is agnostic to the underlying cluster manager and can run on the standalone manager, Mesos, YARN, or Kubernetes.
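On point 2, "agnostic to the cluster manager" concretely means the same application can be submitted to any supported manager just by changing the `--master` URL; a sketch of the standard URL formats (hosts and ports here are placeholders):

```shell
spark-submit --master spark://host:7077 app.py        # standalone manager
spark-submit --master yarn app.py                     # YARN (reads Hadoop config)
spark-submit --master mesos://host:5050 app.py        # Mesos (deprecated in recent Spark)
spark-submit --master k8s://https://host:6443 app.py  # Kubernetes API server
```

On Databricks you never run `spark-submit` against a master yourself for interactive work; the platform manages the cluster and attaches your notebooks and jobs to it.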
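On point 1, it may help to see why usable memory is always less than raw memory even before the Docker image is counted. When Spark asks a cluster manager for an executor container, it requests the executor heap (`spark.executor.memory`) plus an off-heap overhead, which by default is the larger of 384 MiB or 10% of the executor memory (`spark.executor.memoryOverhead` / `spark.executor.memoryOverheadFactor`). A minimal sketch of that arithmetic, assuming the documented defaults:

```python
# Rough sketch of how Spark sizes an executor container request.
# The 384 MiB floor and 0.10 factor are Spark's documented defaults for
# spark.executor.memoryOverhead; your cluster may override them.

def executor_container_memory_mib(executor_memory_mib: int,
                                  overhead_factor: float = 0.10,
                                  min_overhead_mib: int = 384) -> int:
    """Total memory the cluster manager must reserve for one executor."""
    overhead = max(min_overhead_mib, int(executor_memory_mib * overhead_factor))
    return executor_memory_mib + overhead

# An 8 GiB executor actually asks the cluster manager for ~8.8 GiB:
print(executor_container_memory_mib(8192))  # 8192 + 819 = 9011

# Small executors hit the 384 MiB floor instead of the 10% factor:
print(executor_container_memory_mib(1024))  # 1024 + 384 = 1408
```

Whatever your Docker image and the node's system services consume comes out of the remainder, which is why the precise number depends on your image and configuration.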