Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Cluster is not starting

Pavan578
New Contributor II

Cluster 'xxxxxxx' was terminated. Reason: WORKER_SETUP_FAILURE (SERVICE_FAULT). Parameters: databricks_error_message:DBFS Daemon is not reachable., gcp_error_message:Unable to reach the colocated DBFS Daemon.

Can anyone help me resolve this issue?

2 REPLIES

agallard
New Contributor III

Hi @Pavan578 

Here are some steps you can follow to troubleshoot the WORKER_SETUP_FAILURE (SERVICE_FAULT) error with the message "DBFS Daemon is not reachable" on a Databricks cluster running on Google Cloud Platform (GCP):

  1. Check Databricks Service Status:

    • First, check both the Google Cloud Console and the Databricks status portal to see if there are any ongoing issues with Databricks services or resources that could be affecting DBFS connectivity.
  2. Review Cluster Configuration:

    • Make sure the cluster is configured correctly to access Google Cloud Storage and DBFS. This includes verifying IAM permissions and access credentials for both Databricks and DBFS.
  3. Adjust Network Settings and Firewalls:

    • In some cases, DBFS access issues are due to network permission settings or firewall configurations on GCP. Ensure the workers can reach DBFS and that essential ports (e.g., port 443 for HTTPS) are open.
  4. Update Cluster Image:

    • Try updating the cluster image, as recent releases may contain fixes for connectivity and DBFS configuration issues.
  5. Examine Cluster Logs:

    • Check the event logs in Databricks for additional error messages on the workers that might give more context about the connectivity failure.
  6. Resource Scaling and Internal Connectivity:

    • Make sure the cluster has enough resources allocated, and consider scaling up or out if the workers are overloaded, as this can sometimes prevent proper DBFS connectivity.
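
For step 3, a quick way to rule out a blocked port is a plain TCP check run from a machine in the same VPC subnet as the workers. This is only a minimal sketch: the function name `can_reach` and the example host `storage.googleapis.com` are my own assumptions, not something taken from the error message, so swap in whatever endpoints your workspace actually needs to reach.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example usage (hypothetical endpoint, adjust to your environment):
# print(can_reach("storage.googleapis.com", 443))
```

A `False` result only tells you that the TCP connection could not be opened from where you ran it. It does not say whether a firewall rule, a missing route, or DNS is the cause, so treat it as a starting point for checking the GCP network settings.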

Those are the steps I can think of.
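
For step 5, if you prefer to pull the event log programmatically rather than through the UI, something like the sketch below might help. It assumes the Clusters API events endpoint (`POST /api/2.0/clusters/events`) and that termination events carry their reason under `details.reason`; please verify both against the REST API reference for your workspace. The helper names, the workspace URL, and the token are placeholders I made up for illustration.

```python
import json
import urllib.request


def fetch_cluster_events(host: str, token: str, cluster_id: str, limit: int = 50) -> list:
    """Fetch recent events for one cluster (assumed endpoint: /api/2.0/clusters/events)."""
    body = json.dumps({"cluster_id": cluster_id, "order": "DESC", "limit": limit}).encode()
    req = urllib.request.Request(
        f"{host}/api/2.0/clusters/events",
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp).get("events", [])


def termination_reasons(events: list) -> list:
    """Keep only TERMINATING events and flatten their reason fields."""
    reasons = []
    for event in events:
        reason = event.get("details", {}).get("reason")
        if event.get("type") == "TERMINATING" and reason:
            reasons.append({
                "timestamp": event.get("timestamp"),
                "code": reason.get("code"),
                "type": reason.get("type"),
                "parameters": reason.get("parameters", {}),
            })
    return reasons


# Example usage (placeholder values, not real ones):
# events = fetch_cluster_events("https://<workspace-url>", "<token>", "<cluster-id>")
# for r in termination_reasons(events):
#     print(r["timestamp"], r["code"], r["type"], r["parameters"])
```

If every attempt ends with the same WORKER_SETUP_FAILURE code and the same parameters, that points to a consistent configuration or network problem rather than a one-off service hiccup.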

Try it out and let us know! 😉

Alfonso Gallardo
-------------------
I love working with tools like Databricks, Python, Azure, Microsoft Fabric, Azure Data Factory, and other Microsoft solutions, focusing on developing scalable and efficient solutions with Apache Spark

Pavan578
New Contributor II

Thanks @agallard . I will check the above steps and let you know.

 
