Hi @DBEnthusiast, In a Spark cluster, the SparkContext object in your main program (the driver program) connects to a cluster manager, which could be Spark's standalone cluster manager, Mesos, YARN, or Kubernetes. This cluster manager allocates resources across applications.
Once connected, Spark acquires executors on nodes in the cluster: processes that run computations and store data for your application. The SparkContext then sends your application code to the executors, and finally sends tasks for the executors to run. In Databricks, a similar process occurs.
The difference is that Databricks lets you create interactive (all-purpose) clusters in advance, while job clusters are created on the fly when a job starts.
Now, to answer your questions:
1. Yes, the memory available after installing the Docker image will be less than the node's initial memory, since the image's OS and libraries (plus the services Databricks runs on the node) consume part of it. The exact amount depends on the specific Docker image and your cluster configuration, so it's hard to give a precise figure without those details.
2. In Databricks, the resource manager is generally Spark's standalone resource manager. In a general Spark setup, however, Spark is agnostic to the underlying cluster manager and can run on the standalone manager, Mesos, YARN, or Kubernetes.
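On point 2, "agnostic to the cluster manager" concretely means the same application can be submitted to any supported manager just by changing the `--master` URL; a sketch of the standard URL formats (hosts and ports here are placeholders):

```shell
spark-submit --master spark://host:7077 app.py        # standalone manager
spark-submit --master yarn app.py                     # YARN (reads Hadoop config)
spark-submit --master mesos://host:5050 app.py        # Mesos (deprecated in recent Spark)
spark-submit --master k8s://https://host:6443 app.py  # Kubernetes API server
```

On Databricks you never run `spark-submit` against a master yourself for interactive work; the platform manages the cluster and attaches your notebooks and jobs to it.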
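On point 1, it may help to see why usable memory is always less than raw memory even before the Docker image is counted. When Spark asks a cluster manager for an executor container, it requests the executor heap (`spark.executor.memory`) plus an off-heap overhead, which by default is the larger of 384 MiB or 10% of the executor memory (`spark.executor.memoryOverhead` / `spark.executor.memoryOverheadFactor`). A minimal sketch of that arithmetic, assuming the documented defaults:

```python
# Rough sketch of how Spark sizes an executor container request.
# The 384 MiB floor and 0.10 factor are Spark's documented defaults for
# spark.executor.memoryOverhead; your cluster may override them.

def executor_container_memory_mib(executor_memory_mib: int,
                                  overhead_factor: float = 0.10,
                                  min_overhead_mib: int = 384) -> int:
    """Total memory the cluster manager must reserve for one executor."""
    overhead = max(min_overhead_mib, int(executor_memory_mib * overhead_factor))
    return executor_memory_mib + overhead

# An 8 GiB executor actually asks the cluster manager for ~8.8 GiB:
print(executor_container_memory_mib(8192))  # 8192 + 819 = 9011

# Small executors hit the 384 MiB floor instead of the 10% factor:
print(executor_container_memory_mib(1024))  # 1024 + 384 = 1408
```

Whatever your Docker image and the node's system services consume comes out of the remainder, which is why the precise number depends on your image and configuration.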