
Databricks cluster problems at startup

PabloSeoaneF
New Contributor III

Hello,

I've spent some time trying to start a couple of clusters in a private Azure Databricks environment.

For no apparent reason, they keep failing with these two errors:

image.png

Both come with the advice to retry in a few minutes. While looking for information, I found that this can sometimes be caused by network configuration, but the puzzling part is that I'm currently running another cluster in parallel:

image

Could somebody give me more information about these errors?

Regards.

5 REPLIES

karthik_p
Esteemed Contributor

@Pablo Seoane Fuente can you please click on the error? It will display more information, and from that we can try to sort this out.

I wrote a new comment with the error details.

Bartek
Contributor

Hi @Pablo Seoane Fuente,

You can find more information by clicking on a specific event in the 'Event log' tab (as in the attached file).

Without more data it is rather hard to find the root cause: it could be an init script that blocks the cluster start, a custom container you selected that is not available, or something else.

obraz.png
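
For anyone who prefers to pull the same event details without the UI, here is a minimal sketch in Python. It assumes the Clusters REST API endpoint `POST /api/2.0/clusters/events` and a personal access token. The host, token, and cluster ID values are placeholders, and `build_events_request`, `summarize_events`, and `fetch_events` are hypothetical helper names for illustration, not part of any Databricks library.

```python
import json
import urllib.request


def build_events_request(host, token, cluster_id, limit=50):
    """Build (but do not send) a POST request for the cluster events endpoint."""
    body = json.dumps({
        "cluster_id": cluster_id,
        "order": "DESC",   # newest events first
        "limit": limit,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"https://{host}/api/2.0/clusters/events",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def summarize_events(payload):
    """Return (timestamp, type, details) tuples from a parsed events response."""
    return [
        (event.get("timestamp"), event.get("type"), event.get("details", {}))
        for event in payload.get("events", [])
    ]


def fetch_events(host, token, cluster_id):
    """Send the request and summarize the response (needs network access)."""
    request = build_events_request(host, token, cluster_id)
    with urllib.request.urlopen(request) as response:
        return summarize_events(json.load(response))
```

Calling `fetch_events("<workspace-host>", "<token>", "<cluster-id>")` with real values should list the most recent events; the `details` of the termination events are where the same messages shown in the 'Event log' tab would be expected to appear.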

PabloSeoaneF
New Contributor III

I wrote a new comment with the error details.

PabloSeoaneF
New Contributor III

Hello both!

Here are the details of the two errors:

  • Bootstrap timeout:

Bootstrap Timeout. Please try again later.

Instance bootstrap failed command: BootstrapTimeout

Failure message: Bootstrap script took too long and timeout. Please try again later.

VM extension code: ProvisioningState/succeeded

instanceId: InstanceId(ee4923f738ad4c74859cb6d2fb9a88a5)

workerEnv: workerenv-6705609797558422

Additional details (may be truncated):

[Bootstrap Event] Command DownloadBootstrapScript finished. Storage Account: arprodwesteua6.blob.core.windows.net [SUCCEEDED]. Seconds Elapsed: 87.9898

2022/11/18 12:49:02 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command GetToken finished. [SUCCEEDED]. Seconds Elapsed: 0.000132083892822

2022/11/18 12:49:02 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command GetInstanceId finished. [SUCCEEDED]. Seconds Elapsed: 0.0151960849762

2022/11/18 12:49:04 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command GetRunbook finished. [SUCCEEDED]. Seconds Elapsed: 2.03999614716

2022/11/18 12:49:04 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command Command_MakeLogDir finished. [SUCCEEDED]. Seconds Elapsed: 0.00767803192139

2022/11/18 12:49:20 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command FileDownload_UpdateWorker finished. [SUCCEEDED]. Seconds Elapsed: 15.9508230686

2022/11/18 12:49:20 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command Command_Chmod finished. [SUCCEEDED]. Seconds Elapsed: 0.00655198097229

2022/11/18 12:49:23 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command Command_SetupWorker finished. [SUCCEEDED]. Seconds Elapsed: 2.78266096115

[Bootstrap Event] SELF BOOTSTRAP TIMEOUT. [Started at: Fri Nov 18 12:47:29 UTC 2022] [Seconds Elapsed:600] [Start Timestamp: 1668775649] [End Timestamp: 1668776249]

[Bootstrap Event] Can reach westeurope-c2.azuredatabricks.net: [SUCCEEDED]

[Bootstrap Event] Can reach arprodwesteua6.blob.core.windows.net: [SUCCEEDED]
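
To see where the 600 seconds went, the per-command timings in this log can be added up. The following is a rough sketch in Python that parses the `[Bootstrap Event]` lines quoted above. The regular expressions are an assumption based only on the lines in this post, not a documented log format, and `summarize` is a hypothetical helper name.

```python
import re

# Log lines copied verbatim from the error details above.
BOOTSTRAP_LOG = """\
[Bootstrap Event] Command DownloadBootstrapScript finished. Storage Account: arprodwesteua6.blob.core.windows.net [SUCCEEDED]. Seconds Elapsed: 87.9898
2022/11/18 12:49:02 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command GetToken finished. [SUCCEEDED]. Seconds Elapsed: 0.000132083892822
2022/11/18 12:49:02 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command GetInstanceId finished. [SUCCEEDED]. Seconds Elapsed: 0.0151960849762
2022/11/18 12:49:04 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command GetRunbook finished. [SUCCEEDED]. Seconds Elapsed: 2.03999614716
2022/11/18 12:49:04 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command Command_MakeLogDir finished. [SUCCEEDED]. Seconds Elapsed: 0.00767803192139
2022/11/18 12:49:20 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command FileDownload_UpdateWorker finished. [SUCCEEDED]. Seconds Elapsed: 15.9508230686
2022/11/18 12:49:20 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command Command_Chmod finished. [SUCCEEDED]. Seconds Elapsed: 0.00655198097229
2022/11/18 12:49:23 INFO vm_bootstrap.py:1073: [Bootstrap Event] Command Command_SetupWorker finished. [SUCCEEDED]. Seconds Elapsed: 2.78266096115
[Bootstrap Event] SELF BOOTSTRAP TIMEOUT. [Started at: Fri Nov 18 12:47:29 UTC 2022] [Seconds Elapsed:600] [Start Timestamp: 1668775649] [End Timestamp: 1668776249]
"""

# Assumed patterns, inferred from the lines above.
STEP_RE = re.compile(r"Command (\w+) finished\..*?\[(\w+)\]\. Seconds Elapsed: ([\d.]+)")
TIMEOUT_RE = re.compile(r"SELF BOOTSTRAP TIMEOUT\..*?\[Seconds Elapsed:\s*(\d+)\]")


def summarize(log):
    """Return (steps, logged_seconds, timeout_budget) parsed from a bootstrap log."""
    steps = [
        (m.group(1), m.group(2), float(m.group(3)))
        for m in STEP_RE.finditer(log)
    ]
    timeout = TIMEOUT_RE.search(log)
    budget = int(timeout.group(1)) if timeout else None
    logged = sum(seconds for _, _, seconds in steps)
    return steps, logged, budget


steps, logged, budget = summarize(BOOTSTRAP_LOG)
for name, status, seconds in steps:
    print(f"{name:28s} {status:10s} {seconds:9.3f}s")
print(f"logged: {logged:.1f}s of a {budget}s budget")
```

On this log, the eight logged commands add up to roughly 109 seconds, and the last one finishes at 12:49:23, while the timeout fires 600 seconds after the 12:47:29 start. So about eight minutes pass after `Command_SetupWorker` without any logged step, which suggests the stall happens in whatever runs after that command rather than in the downloads shown here.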

--------------------------------------------------------------------------------------------

  • Spark Image download Failure

Failed to set up spark container due to an image download failure: Exception when downloading spark image:

Stdout:

Stderr: 2022/11/18 13:17:02 INFO worker_common.py:462: Acquiring lock file: /var/lib/lxc/base-images/release__10.4.x-snapshot-scala2.12__databricks-universe__head__97ab960__313f4e7__jenkins__796c0ee__format-2.lock

java.util.concurrent.TimeoutException: Timed out with exception after 23783 attempts

Regards.
