
Notebook cells stuck on "waiting to run" when using Cluster Libraries

jannikj
New Contributor III

Hey,

we're observing the following problem when trying to run a notebook on a cluster with libraries installed:
All notebook cells are stuck in "Waiting to run" (also ones containing only a '1+1' statement).

When running the cluster without installing the packages, notebook execution works fine.

We're using a private setup with public network access disabled in Azure Databricks.

We tried this on both a Shared Cluster and a Single User Cluster.
The runtime version is 13.3 LTS with Worker/Driver Type of Standard_DS3_v2 and 2 workers.
Termination is configured after 30 minutes of inactivity.

The packages we are trying to install are spark-xml and a custom-written, plain Python package without dependencies. Both packages are uploaded as files: to the workspace (for the Single User Cluster) and to a Unity Catalog Volume (for the Shared Cluster).

In the Event log, the clusters show the expected STARTING -> RUNNING -> DRIVER HEALTHY messages. The checkmarks in the Compute UI and on the library page are all green.

One thing that caught my eye when checking the Driver Logs:
For a cluster without packages installed (i.e. running normally), the Standard error output looks like this:

ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
ANTLR Tool version 4.8 used for code generation does not match the current runtime version 4.9.3
Wed Nov 15 08:28:57 2023 Connection to spark from PID  1157
Wed Nov 15 08:28:57 2023 Initialized gateway on port 39503
Wed Nov 15 08:28:58 2023 Connected to spark.

On a cluster with packages installed, the last three lines are missing.
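
To make that comparison repeatable, the check can be scripted against a copy of the driver's Standard error output. This is only a minimal sketch: the function name `repl_gateway_markers` is hypothetical, and the three marker strings are taken from the healthy log shown above, so they may differ on other runtime versions.

```python
def repl_gateway_markers(stderr_text):
    """Report which of the three gateway startup lines appear in the log text.

    The marker strings are copied from the healthy driver log in this post;
    treating them as a health signal is an assumption, not documented behavior.
    """
    markers = (
        "Connection to spark from PID",
        "Initialized gateway on port",
        "Connected to spark.",
    )
    lines = stderr_text.splitlines()
    # For each marker, record whether any log line contains it.
    return {marker: any(marker in line for line in lines) for marker in markers}


def repl_gateway_started(stderr_text):
    """True only if all three startup lines are present."""
    return all(repl_gateway_markers(stderr_text).values())
```

Passing in the Standard error text downloaded from the Driver Logs tab would then show at a glance whether the Python gateway came up on a given cluster.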

Also, after a few minutes, the Event log shows the message METASTORE DOWN for the cluster with libraries installed.

I would greatly appreciate any thoughts on this! 

1 ACCEPTED SOLUTION

Accepted Solutions

jannikj
New Contributor III

Hi,

we were actually able to fix the issue, but I completely forgot about my post.
The problem was that some Databricks URLs were not accessible from the cluster due to firewall restrictions.
We had to make sure that all URLs from this list for the region our cluster is in were reachable: IP addresses and domains for Azure Databricks services and assets - Azure Databricks | Microsoft Lea...

This fixed the issue and the cluster now runs normally.
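
For anyone wanting to test reachability before changing firewall rules, a rough sketch of a TCP check follows. It uses only Python's standard `socket` module. The helper name `check_endpoints` is hypothetical, and the host in the usage comment is a placeholder: the real hostnames and ports have to come from the Microsoft list for your region. Since cells on the affected cluster never start, this assumes the check is run from somewhere behind the same firewall where code does execute, for example a cluster in the same network without libraries attached.

```python
import socket


def check_endpoints(endpoints, timeout=5.0):
    """Try a TCP connection to each (host, port) pair.

    Returns a dict mapping (host, port) to None when the connection
    succeeded, or to a short error string when it failed.
    """
    results = {}
    for host, port in endpoints:
        try:
            # create_connection resolves the name and opens a TCP socket.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = None
        except OSError as exc:
            # Covers DNS failures, refused connections, and timeouts.
            results[(host, port)] = f"{type(exc).__name__}: {exc}"
    return results


# Usage (placeholder host, not a real Databricks endpoint):
# for endpoint, error in check_endpoints([("<host-from-the-list>", 443)]).items():
#     print(endpoint, "OK" if error is None else error)
```

A successful TCP connection only shows that the port is open through the firewall; it does not prove that the service behind it is working correctly.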


3 REPLIES

nkraj
Esteemed Contributor III

This can happen if the Metastore client fails to initialize. When Python libraries or JARs are added, the REPL creation step adds the installed libraries via the Spark addJar operation. This initializes the metastore client and can get stuck if there is any problem with the initialization.

Kindly also verify the metastore connectivity.

BhawaniD
New Contributor II

Did you manage to fix this issue? I am facing a similar situation while running a notebook to read the XML files from the storage account.

 

