Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Internal gRPC errors when using Databricks Connect

marcelhfm
New Contributor II

Hey there, 

In our local development flow we rely heavily on Databricks Asset Bundles and Databricks Connect. Recently, locally run workflows (i.e. plain PySpark Python files) have begun to fail frequently with the following gRPC error:

pyspark.errors.exceptions.connect.SparkConnectGrpcException: <_MultiThreadedRendezvous of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "Cannot operate on a handle that is closed."
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"Cannot operate on a handle that is closed.", grpc_status:13, created_time:"2025-03-17T15:51:24.396549+01:00"}"

This error is non-deterministic: cluster restarts sometimes allow us to run workflows once or twice before the error appears again. It might be coincidental given the non-deterministic nature, but some PySpark code seems to fail with this error more often than other code.
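Since the failure is transient, a workaround we have been experimenting with is a small retry helper that re-runs the action when this gRPC error appears. This is only a sketch: the helper itself is generic Python, and the Databricks Connect usage in the comments (`DatabricksSession`, `SparkConnectGrpcException`) is assumed from the databricks-connect API rather than confirmed to fix the issue.

```python
# Hedged workaround sketch: retry a PySpark action when a transient
# gRPC INTERNAL error ("Cannot operate on a handle that is closed.")
# interrupts it. The retry helper is generic; the Databricks Connect
# names in the usage comment below are assumptions.
import time


def run_with_retry(action, is_transient, retries=3, delay=5.0):
    """Run `action` (a zero-arg callable); retry only transient errors."""
    last_exc = None
    for _attempt in range(retries):
        try:
            return action()
        except Exception as exc:
            if not is_transient(exc):
                raise  # non-transient errors propagate immediately
            last_exc = exc
            time.sleep(delay)
    raise last_exc  # all retries exhausted


# Assumed usage with Databricks Connect:
#
# from databricks.connect import DatabricksSession
# from pyspark.errors.exceptions.connect import SparkConnectGrpcException
#
# def job():
#     spark = DatabricksSession.builder.getOrCreate()
#     return spark.range(10).count()
#
# run_with_retry(job, lambda e: isinstance(e, SparkConnectGrpcException))
```

This does not address the root cause, but it has let flaky local runs complete instead of failing outright.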

databricks-connect version: 15.4.7

databricks-sdk: 0.29.0

cluster runtime: 15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12)


Researching this error returns basically zero results, so I'm asking whether someone else has encountered and solved this before, or whether it is a known issue.

Thanks!

7 REPLIES

somya
New Contributor II

@marcelhfm Were you able to find a solution to this error?

marcelhfm
New Contributor II

No, unfortunately not. Have you encountered similar behavior before?

ChrisChieu
Databricks Employee

Hey @marcelhfm 
Two questions:
- Was your script working before with the same configuration? 
- What are you trying to do?

Hey,

Was your script working before with the same configuration?
- Yes, this error has only started appearing recently. It is highly non-deterministic and comes up a couple of times per week.

 

What are you trying to do?

- We're using Asset Bundles and Databricks Connect to develop PySpark tasks. More specifically, to speed up our development flow, we write PySpark tasks and execute them locally. Later they will be turned into DLT workflows.
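For context, our local pattern looks roughly like the sketch below: the same PySpark file runs either on a cluster or locally over Databricks Connect, depending on which session builder is importable. The `DatabricksSession` name is assumed from the databricks-connect package; the fallback is plain PySpark.

```python
# Sketch of the local-development pattern (assumed layout): one task file
# that obtains a Spark session remotely via Databricks Connect when the
# databricks-connect package is installed, and locally otherwise.
def get_spark():
    try:
        # databricks-connect: executes against a remote cluster over gRPC
        from databricks.connect import DatabricksSession
        return DatabricksSession.builder.getOrCreate()
    except ImportError:
        # plain PySpark fallback, e.g. when running on the cluster itself
        from pyspark.sql import SparkSession
        return SparkSession.builder.getOrCreate()
```

The gRPC error above is raised from actions executed on the session returned by the Databricks Connect branch.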

Is there any more information I can provide?

cmathieu
New Contributor III

@ChrisChieu 

I've had the same issue happen to me today: a previously working workflow on serverless compute that runs a streaming foreachBatch operation. It failed more or less silently and ended up timing out.
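One thing that helped me see these silent failures at all is wrapping the foreachBatch handler so exceptions are logged before re-raising. This is a sketch; the handler and stream names in the usage comment are placeholders, not from the failing workflow.

```python
# Hedged sketch: wrap a foreachBatch handler so batch failures are logged
# in the driver output instead of disappearing silently. Handler and
# stream names in the usage comment below are hypothetical.
import logging

logger = logging.getLogger("stream")


def logged_batch(handler):
    """Wrap a foreachBatch handler to log and re-raise any failure."""
    def wrapper(batch_df, epoch_id):
        try:
            handler(batch_df, epoch_id)
        except Exception:
            logger.exception("foreachBatch failed at epoch %s", epoch_id)
            raise  # keep the stream's failure semantics intact
    return wrapper


# Assumed usage:
#
# query = (df.writeStream
#            .foreachBatch(logged_batch(process_batch))
#            .start())
```

It doesn't prevent the gRPC error, but at least the logs show which epoch died instead of the query just timing out.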

lukasstr
New Contributor II

@ChrisChieu 

I am encountering the exact same error as marcelhfm on my side as well. I had only very rarely hit this issue in the past, and a re-run usually worked fine.

Since Friday (note: no code or environment changes made), I'm encountering this issue on almost everything I try to run with Databricks Connect, which is a major problem for us as it renders most of our local workflows unusable.

ChrisChieu
Databricks Employee

@marcelhfm it might be a Spark Connect issue.
I would say the same applies to the rest of you.

There is not much to do until the situation is fixed by Databricks.