07-06-2023 01:30 PM - edited 07-06-2023 02:44 PM
Is there a way to make databricks-connect wait for the cluster to be running?
Details:
databricks-connect==13.1.0, and the Python minor version of both the cluster and my environment is 3.10.
If the cluster is not running, the code below starts it, but any command issued afterwards fails because the client does not wait for the cluster to be ready:
import os

from databricks.connect import DatabricksSession
from databricks.sdk.core import Config

# Get a Spark session using the Databricks SDK's Config class:
config = Config(
    host=os.environ.get("DATABRICKS_HOST"),
    token=os.environ.get("DATABRICKS_TOKEN"),
    cluster_id=os.environ.get("DATABRICKS_CLUSTER_ID"),
)
spark = DatabricksSession.builder.sdkConfig(config).getOrCreate()
Any subsequent command that uses `spark` fails with an error like:
pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.FAILED_PRECONDITION
details = "INVALID_STATE: Cluster [MASKED] is in unexpected state Pending."
debug_error_string = "UNKNOWN:Error received from peer {created_time:"2023-07-06T18:57:01.084365359+00:00", grpc_status:9, grpc_message:"INVALID_STATE: Cluster [MASKED] is in unexpected state Pending."}"
If the cluster is already running, everything works as expected.
I am trying to set up a test CI job, so this is a pain: I have to either manually make sure the cluster is running or restart the job once it is.
07-06-2023 02:10 PM
Are you using db-connect 13?
07-06-2023 02:15 PM
If you want to use PySpark UDFs, it's important that your development machine's installed minor version of Python match the minor version of Python that is included with Databricks Runtime installed on the cluster.
Please refer to the documentation (Databricks Connect | Databricks on AWS) and check whether your setup meets the required configuration.
For Databricks Runtime 13.0 and higher, Databricks Connect is now built on open-source Spark Connect.
07-06-2023 02:32 PM
Yes, I am using 13.1.0, and the Python minor version of both the cluster and my environment is 3.10. Sorry, I should have put that in the question.
07-12-2023 02:41 AM
Hi @AFox
Thank you for posting your question in our community! We are happy to assist you.
To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?
This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance!
07-12-2023 09:22 AM
The question has not been answered. databricks-connect does not wait for the selected cluster to start. This needs to be an option or the tool is not nearly as useful.
11-21-2023 11:28 AM
FYI for anyone who finds this: this seems to be resolved in databricks-connect 14+.
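For anyone still stuck on 13.x, one possible workaround is to wait for the cluster yourself before creating the session. The following is only a sketch, not something from the Databricks docs: `wait_until_running` and `get_spark_when_ready` are hypothetical helper names, the timeout and poll interval are arbitrary, and it assumes the Databricks SDK's `WorkspaceClient.clusters.get` and `clusters.start` calls behave the same way in your installed SDK version. The state names (`PENDING`, `RUNNING`, `TERMINATED`, `ERROR`, `UNKNOWN`) are the cluster states reported by the Clusters API.

```python
import os
import time
from typing import Callable, Iterable


def wait_until_running(
    get_state: Callable[[], str],
    timeout_s: float = 1200.0,
    poll_s: float = 15.0,
    failure_states: Iterable[str] = ("ERROR", "UNKNOWN"),
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll get_state() until it returns "RUNNING".

    Raises RuntimeError on a failure state and TimeoutError if the
    deadline passes first. This helper is hypothetical, not an SDK API.
    """
    deadline = clock() + timeout_s
    failure = set(failure_states)
    while True:
        state = get_state()
        if state == "RUNNING":
            return
        if state in failure:
            raise RuntimeError(f"Cluster entered state {state}")
        if clock() >= deadline:
            raise TimeoutError(f"Cluster still {state} after {timeout_s}s")
        sleep(poll_s)


def get_spark_when_ready():
    """Start the cluster if needed, wait for it, then build the session."""
    # Imported here so wait_until_running works without these packages.
    from databricks.connect import DatabricksSession
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.core import Config

    cluster_id = os.environ.get("DATABRICKS_CLUSTER_ID")
    config = Config(
        host=os.environ.get("DATABRICKS_HOST"),
        token=os.environ.get("DATABRICKS_TOKEN"),
        cluster_id=cluster_id,
    )
    w = WorkspaceClient(config=config)

    def current_state() -> str:
        state = w.clusters.get(cluster_id=cluster_id).state
        return state.value if state is not None else "UNKNOWN"

    if current_state() == "TERMINATED":
        # Request a start; the polling below does the actual waiting.
        w.clusters.start(cluster_id=cluster_id)
    wait_until_running(current_state)
    return DatabricksSession.builder.sdkConfig(config).getOrCreate()
```

In a CI job you would call `spark = get_spark_when_ready()` in place of the original `getOrCreate()` line. The polling logic is kept separate from the SDK calls so it can be checked with a fake state function without touching a real workspace.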