
Delta Live Tables: BAD_REQUEST: Pipeline cluster is not reachable.

jorgemarmol
New Contributor II

Hello community:

 

I don't know why my Delta Live Tables workflow fails at this step. This is the configuration I have for the pipeline:

{
  "id": "**",
  "pipeline_type": "WORKSPACE",
  "clusters": [
    {
      "label": "default",
      "spark_conf": {***},
      "num_workers": 3
    },
    {
      "label": "maintenance",
      "spark_conf": {****},
      "num_workers": 3
    }
  ],
  "development": false,
  "continuous": false,
  "channel": "CURRENT",
  "photon": false,
  "libraries": [
    {
      "notebook": {
        "path": "/GeneracionAutomatica/DLT Generator (bronze)"
      }
    },
    {
      "notebook": {
        "path": "/GeneracionAutomatica/DLT Generator (staging)"
      }
    }
  ],
  "name": "Delta Live Tables (bronze)",
  "edition": "ADVANCED",
  "storage": "****",
  "configuration": {
    "pipelines.autoOptimize.zOrderCols": "\"timestamp\"",
    "pipelines.numUpdateRetryAttempts": "7"
  },
  "target": "myrl"
}

 

Note: *** marks confidential information.

Sometimes the result is this:

jorgemarmol_0-1692258030914.png


But sometimes it works and completes the pipeline!

 

Does anyone have any idea? Thanks!

 

2 REPLIES

Kaniz_Fatma
Community Manager

Hi @jorgemarmol, there could be several reasons why the pipeline cluster is unreachable.

Some of the possible reasons include:

1. Infrastructural issues: The dev clusters are occasionally down due to infrastructural issues. Infra changes are often tested in dev clusters first, which can cause instability and make them unavailable. In such cases, deploys to those clusters will fail, but users can ignore those failures if they have a release that will succeed in staging despite the infra issues in dev.

2. Self Bootstrap Failure or NPIP Tunnel Setup Failure: This error indicates that bootstrap failed due to network connectivity issues between the data plane and the control plane. More specifically, the Databricks secure cluster connectivity (SCC) relay is not reachable, or files cannot be downloaded from the artifact bucket due to DNS resolution issues.

3. Spark driver unreachable: This issue can be caused by invalid Spark configurations or malfunctioning init scripts. It can also occur if there is no NAT gateway or other egress path from the customer VPC to the Databricks control plane, or if the customer VPC has an egress firewall rule blocking traffic to the control plane.
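The network-related causes in the list above (unreachable SCC relay, DNS resolution failures, blocked egress) can be probed roughly with a small script run from a machine or notebook inside the same network. This is only a minimal sketch using Python's standard `socket` module, not a replacement for the official diagnostic tooling. The function name `check_endpoint` is hypothetical, and the hostname you pass in is something you need to look up for your own workspace and region; no real endpoint is assumed here.

```python
import socket


def check_endpoint(host, port=443, timeout=5.0):
    """Check DNS resolution and TCP reachability for host:port.

    Hypothetical helper for illustration; returns a dict with the outcome
    of each step so a failure can be attributed to DNS or to the connection.
    """
    result = {"host": host, "port": port,
              "dns_ok": False, "tcp_ok": False, "error": None}

    # Step 1: DNS resolution. A failure here points at resolver/DNS setup.
    try:
        addresses = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["dns_ok"] = True
    except socket.gaierror as exc:
        result["error"] = f"DNS resolution failed: {exc}"
        return result

    # Step 2: TCP connect. A failure here points at routing, NAT,
    # or a firewall rule blocking egress.
    for family, socktype, proto, _canonname, sockaddr in addresses:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            result["tcp_ok"] = True
            result["error"] = None
            return result
        except OSError as exc:
            result["error"] = f"TCP connect failed: {exc}"

    return result


# Example call (the hostname is a placeholder you must fill in yourself):
# print(check_endpoint("<your-control-plane-or-relay-host>", 443))
```

If `dns_ok` is false, the problem is likely in name resolution; if `dns_ok` is true but `tcp_ok` is false, an egress route or firewall rule is the more likely suspect. Because the failure in this thread is intermittent, a single successful check does not rule out the network, so it may be worth running it repeatedly.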

To troubleshoot these issues, you can download the system log, search for "FAILED_MESSAGE", and use a Base64 decoding tool to decode the message. The Network Diagnostic Tool can also be used to perform a detailed validation. It checks network latency, SSL and application-layer connectivity, and DNS resolution against the published and specific endpoints. Run the tool in standalone mode, as described in the README bundled with the tool.
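The "search and decode" step can be sketched in a few lines of Python if you prefer not to paste log contents into an online decoder. This is a rough sketch under assumptions: the thread does not show what the log line actually looks like, so the code simply assumes the Base64 payload appears somewhere after the "FAILED_MESSAGE" marker on the same line. The function name `decode_failed_messages` is hypothetical.

```python
import base64
import binascii
import re

# Runs of Base64 alphabet characters, optionally followed by padding.
# The minimum length of 16 is an arbitrary choice to skip short ordinary words.
BASE64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decode_failed_messages(log_text, marker="FAILED_MESSAGE"):
    """Return decoded Base64 payloads found on lines containing the marker.

    Assumes (not confirmed by the thread) that the payload follows the
    marker on the same line and decodes to UTF-8 text.
    """
    decoded = []
    for line in log_text.splitlines():
        if marker not in line:
            continue
        # Only look at the part of the line after the marker.
        tail = line.split(marker, 1)[1]
        for token in BASE64_TOKEN.findall(tail):
            try:
                raw = base64.b64decode(token, validate=True)
                decoded.append(raw.decode("utf-8"))
            except (binascii.Error, UnicodeDecodeError):
                # Not valid Base64 or not text; skip this token.
                continue
    return decoded


# Example usage with a downloaded log file (the path is a placeholder):
# with open("system.log", encoding="utf-8", errors="replace") as fh:
#     for message in decode_failed_messages(fh.read()):
#         print(message)
```

If the function returns an empty list, either the marker is not in that log at all or the payload is laid out differently from what is assumed here.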

Hi @Kaniz_Fatma, thank you so much for your answer!

Could you tell me how I can download the logs and what standalone mode means, please? I have downloaded the logs from the cluster driver logs, but I can't find the "FAILED_MESSAGE":

jorgemarmol_0-1692288509376.png

Thank you!
