Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Cluster is not starting

Pavan578
New Contributor II

Cluster 'xxxxxxx' was terminated. Reason: WORKER_SETUP_FAILURE (SERVICE_FAULT). Parameters: databricks_error_message:DBFS Daemon is not reachable., gcp_error_message:Unable to reach the colocated DBFS Daemon.

Can anyone help me resolve this issue?

2 REPLIES

agallard
Contributor

Hi @Pavan578 

To troubleshoot the WORKER_SETUP_FAILURE (SERVICE_FAULT) error with the message "DBFS Daemon is not reachable" on a Databricks cluster running on Google Cloud Platform (GCP), here are some steps you can follow:

  1. Check Databricks Service Status:
    • First, check both the Google Cloud Console and the Databricks status page to see whether there are any ongoing incidents affecting Databricks services or resources that could impact DBFS connectivity.
  2. Review Cluster Configuration:
    • Make sure the cluster is configured correctly to access Google Cloud Storage and DBFS. This includes verifying the IAM permissions and access credentials used by both Databricks and DBFS (the SDK sketch after this list shows one way to pull the current configuration).
  3. Adjust Network Settings and Firewalls:
    • In some cases, DBFS access issues come down to network settings or firewall rules in GCP. Ensure the workers can reach DBFS and that the required ports (e.g., port 443 for HTTPS) are open (see the connectivity sketch after this list).
  4. Update the Cluster Image:
    • Try moving to a more recent Databricks Runtime version, as newer releases may contain fixes for connectivity and DBFS configuration issues.
  5. Examine Cluster Logs:
    • Check the cluster event logs in Databricks for additional error messages from the workers that might give more context about the connectivity failure (the SDK sketch after this list also retrieves these events).
  6. Resource Scaling and Internal Connectivity:
    • Make sure the cluster has enough resources allocated, and consider scaling up or out if the workers are overloaded, as this can sometimes prevent proper DBFS connectivity.
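
For steps 2 and 5, here is a minimal sketch of how you could pull the cluster configuration and its recent event log with the Databricks SDK for Python. This assumes the databricks-sdk package is installed and authentication is already configured (environment variables or a ~/.databrickscfg profile); the cluster ID is a placeholder.

    from itertools import islice

    from databricks.sdk import WorkspaceClient

    # Auth is picked up from DATABRICKS_HOST/DATABRICKS_TOKEN or ~/.databrickscfg
    w = WorkspaceClient()

    cluster_id = "xxxxxxx"  # replace with the failing cluster's ID

    # Step 2: inspect the cluster configuration (runtime version, node type, state)
    info = w.clusters.get(cluster_id=cluster_id)
    print(info.spark_version, info.node_type_id, info.state, info.state_message)

    # Step 5: dump the most recent cluster events, which often carry more detail
    # than the termination banner shown in the UI
    for event in islice(w.clusters.events(cluster_id=cluster_id), 20):
        print(event.timestamp, event.type, event.details)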
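For step 3, a quick way to confirm outbound HTTPS connectivity is a simple socket test run from a notebook on a working cluster in the same VPC/subnet (or from a GCE VM there). The hostnames below are only examples; replace the workspace placeholder with your own workspace URL.

    import socket

    # Endpoints the workers need to reach over HTTPS (port 443).
    # Adjust these to your region/workspace; the workspace host is a placeholder.
    hosts = [
        "storage.googleapis.com",               # Google Cloud Storage backing DBFS on GCP
        "<your-workspace>.gcp.databricks.com",  # Databricks control plane (replace)
    ]

    for host in hosts:
        try:
            with socket.create_connection((host, 443), timeout=5):
                print(f"OK:   {host}:443 is reachable")
        except OSError as exc:
            print(f"FAIL: {host}:443 is not reachable ({exc})")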

Those are the steps I can think of.

Try them out and let us know! 😉

Alfonso Gallardo
-------------------
 I love working with tools like Databricks, Python, Azure, Microsoft Fabric, Azure Data Factory, and other Microsoft solutions, focusing on developing scalable and efficient solutions with Apache Spark

Pavan578
New Contributor II

Thanks @agallard. I will check the above steps and let you know.

 
