databricks-connect: Error: Cluster MASKED is in unexpected state Pending.
07-06-2023 01:30 PM - edited 07-06-2023 02:44 PM
Is there a way to make databricks-connect wait for the cluster to be running?
Details:
I am using databricks-connect==13.1.0, and the Python minor version is 3.10 on both the cluster and my local environment.
If the cluster is not running, the code below starts it, but every command afterwards fails because the session does not wait for the cluster to be ready:
import os

from databricks.connect import DatabricksSession
from databricks.sdk.core import Config

# Get a Spark session using the Databricks SDK's Config class.
# Connection details are read from environment variables.
config = Config(
    host=os.environ.get("DATABRICKS_HOST"),
    token=os.environ.get("DATABRICKS_TOKEN"),
    cluster_id=os.environ.get("DATABRICKS_CLUSTER_ID"),
)
spark = DatabricksSession.builder.sdkConfig(config).getOrCreate()
Any subsequent command that uses `spark` fails with an error like:
pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.FAILED_PRECONDITION
details = "INVALID_STATE: Cluster [MASKED] is in unexpected state Pending."
debug_error_string = "UNKNOWN:Error received from peer {created_time:"2023-07-06T18:57:01.084365359+00:00", grpc_status:9, grpc_message:"INVALID_STATE: Cluster [MASKED] is in unexpected state Pending."}"
If the cluster is already running, everything works as expected.
I am trying to set up a test CI job, so this is a pain: I have to either manually make sure the cluster is running beforehand or restart the job once it is.
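One workaround I am considering is to poll the cluster state before running any Spark command. Below is a minimal sketch of that idea, not a built-in databricks-connect feature. The helper name `wait_until_running` and its parameters are hypothetical, and `get_state` is any callable you supply that returns the cluster state as a string such as "PENDING" or "RUNNING".

```python
import time
from typing import Callable, Iterable


def wait_until_running(
    get_state: Callable[[], str],
    timeout_s: float = 1200.0,
    poll_s: float = 15.0,
    failed_states: Iterable[str] = ("ERROR",),
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it returns "RUNNING".

    Hypothetical helper: get_state is supplied by the caller and must
    return the current cluster state as a string.
    Raises RuntimeError if a state in failed_states is seen, and
    TimeoutError if the cluster is not running within timeout_s seconds.
    """
    failed = set(failed_states)
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "RUNNING":
            return state
        if state in failed:
            raise RuntimeError(f"Cluster entered state {state}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Cluster still in state {state} after {timeout_s}s")
        # Not ready yet (e.g. PENDING): wait before polling again.
        sleep(poll_s)
```

With the Databricks SDK for Python, `get_state` could be something like `lambda: w.clusters.get(cluster_id).state.value`, assuming `w` is a `WorkspaceClient` built from the same host and token. The SDK also appears to provide `w.clusters.ensure_cluster_is_running(cluster_id)`, which may make a hand-written loop unnecessary, so it is worth checking that first. Either call would go before the first command that uses `spark`.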