06-28-2022 06:23 AM
What will happen if the driver node fails?
What will happen if one of the worker nodes fails?
Is it the same in Spark and Databricks, or does Databricks provide additional features to handle these situations?
06-28-2022 07:34 AM
If the driver node fails, your cluster will fail. If a worker node fails, Databricks will spawn a new worker node to replace the failed node and resume the workload. Generally it is recommended to use an on-demand instance for your driver and spot instances as worker nodes.
As for a comparison between Spark and Databricks, please visit our comparison page (https://databricks.com/spark/comparing-databricks-to-apache-spark).
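For reference, here is a minimal sketch of what that recommendation looks like as a cluster spec on AWS, which you could post to the Clusters API (POST /api/2.0/clusters/create). The cluster name, instance type, and worker count below are illustrative, not from this thread:

```python
# Sketch of a cluster spec that keeps the driver on an on-demand instance
# while running the workers on spot instances (AWS example).
cluster_spec = {
    "cluster_name": "example-spot-workers",   # hypothetical name
    "spark_version": "10.4.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "num_workers": 4,
    "aws_attributes": {
        # The first N nodes of the cluster (the driver counts as node 1)
        # are placed on on-demand instances; 1 keeps only the driver on-demand.
        "first_on_demand": 1,
        # Remaining nodes use spot capacity, falling back to on-demand
        # if spot capacity is unavailable.
        "availability": "SPOT_WITH_FALLBACK",
    },
}
```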
06-28-2022 08:46 AM
So is the data copied to other worker nodes?
Or is the data on that worker node lost?
06-28-2022 07:43 AM
Good one @Cedric Law Hing Ping
06-28-2022 08:53 AM
So even if a worker node fails in the middle of a job, it will resume the job?
And what about the data on that worker node?
Is it lost?
06-28-2022 09:15 AM
Yes, the cluster will treat it as a lost worker and schedule the workload on a different worker. Temporary data on the failed worker is lost and has to be recomputed by another worker node.
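To illustrate in PySpark (a sketch only; the paths below are hypothetical): cached partitions held by a lost worker are rebuilt from the RDD/DataFrame lineage on another worker, and replicated storage levels or checkpointing can limit how much has to be recomputed.

```python
from pyspark.sql import SparkSession
from pyspark import StorageLevel

spark = SparkSession.builder.getOrCreate()

df = spark.read.parquet("/mnt/raw/events")        # hypothetical path
clicks = df.filter("event_type = 'click'")

# Default caching keeps one copy of each partition; if the worker holding a
# partition dies, Spark recomputes that partition from lineage elsewhere.
clicks.persist(StorageLevel.MEMORY_AND_DISK)

# Replicated storage levels (the *_2 variants) keep a second copy on another
# node, so a single worker failure does not force recomputation:
# clicks.persist(StorageLevel.MEMORY_AND_DISK_2)

# Checkpointing writes the data to reliable storage and truncates the lineage,
# so recovery reads from storage instead of replaying the transformations.
spark.sparkContext.setCheckpointDir("/mnt/checkpoints")  # hypothetical path
checkpointed = clicks.checkpoint()
```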
06-28-2022 09:27 AM
Alright, thanks!