yesterday
We migrated our GKE cluster to a GCE cluster as per the Databricks documentation. Everything works fine on a GCE all-purpose cluster, but we get an error when trying to access a Databricks-managed secret from a GCE job cluster. The job is executed by a service principal that has all the required permissions; it was working fine on the GKE job cluster. Here is the error trace:
org.apache.http.conn.HttpHostConnectException: Connect to europe-west3.gcp.databricks.com:443 [europe-west3.gcp.databricks.com/34.159.208.230] failed: Connection timed out (Connection timed out)
File <command-1511365916692010>, line 1
----> 1 mongo_prd_user = dbutils.secrets.get(scope="<scope_name>", key="prd_user")
      2 mongo_prd_password = dbutils.secrets.get(scope="<scope_name>", key="prd_password")
yesterday
Hi @Sadam97,
If your account uses a customer-managed VPC, you need to manually add a firewall rule to permit traffic between the Databricks-managed VMs within your VPC.
You can also test connectivity to the Databricks secrets API endpoint from within the GCE job cluster to rule out network issues. Use a tool like curl or telnet to check whether europe-west3.gcp.databricks.com:443 is reachable.
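If curl or telnet is awkward to run on a job cluster, the same TCP check can be done from a notebook cell with the Python standard library. This is just a minimal sketch: the `check_endpoint` helper is hypothetical (not a Databricks API), and the host below is taken from the error trace in this thread.

```python
# Minimal TCP connectivity probe, runnable in a notebook cell on the job cluster.
# check_endpoint is a hypothetical helper, not part of any Databricks library.
import socket


def check_endpoint(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False


if __name__ == "__main__":
    # Endpoint taken from the error trace above; a timeout here points at
    # a network/firewall issue rather than a permissions problem.
    print(check_endpoint("europe-west3.gcp.databricks.com", 443))
```

If this prints False on the job cluster but True on the all-purpose cluster, that would suggest the two cluster types end up with different egress rules rather than a secrets-permission issue.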
yesterday - last edited yesterday
I think you missed the point that it works fine on the GCE all-purpose cluster, which means the all-purpose cluster is able to access the Databricks-managed secret. The issue is only with the GCE job cluster when it tries to access the secret. Also, the firewall rule was added automatically when I updated the permissions per that documentation.
yesterday
Hi @Sadam97,
Thanks for your comments. Can you run the connectivity test I mentioned above? Is it failing intermittently or consistently?
yesterday
Hi @Alberto_Umana ,
Here is the response of the telnet command on the all-purpose GCE cluster:
Trying 34.159.208.230...
Connected to europe-west3.gcp.databricks.com.
Escape character is '^]'.
Connection closed by foreign host.
yesterday
Hi @Sadam97,
Thanks for checking in. Would you please DM me your workspace ID and cluster ID so we can dig deeper into our backend logs? A connection timeout could be due to several factors.
5 hours ago
Hi @Alberto_Umana ,
Here are the requested details,
cluster_id: 6223-152828-9jhx6lo7
workspace_id: 3976272202403488
30m ago
Hi @Sadam97,
As I mentioned in my message, the failure happened because the cluster failed to add containers. This could be due to a number of different reasons, so I'd ask you to raise a support case with us so we can investigate it properly.