08-20-2024 05:53 AM - edited 08-20-2024 05:55 AM
Hi All,
I am using Databricks Connect 14.3.2 against a Databricks Runtime 14.3 LTS cluster to execute the code below. The CSV file is only 7 MB; the code runs without issues on Runtime 15+ clusters but consistently fails with the error shown below on 14.3 LTS. Please advise.
SparkConnectGrpcException: (org.apache.spark.sql.connect.common.InvalidPlanInput) Not found any cached local relation with the hash: hash_guid in the session with sessionUUID session_guid.
import pandas as pd
from databricks.connect import DatabricksSession

# Connection details redacted.
cluster_id = "*****"
user_token = "*******"
host = "https://***********.azuredatabricks.net/"

# Build a Databricks Connect session against the remote cluster.
sp = DatabricksSession.builder.remote(
    host=host, cluster_id=cluster_id, token=user_token
).getOrCreate()

# Read the ~7 MB CSV locally and ship it to the cluster as a Spark DataFrame.
df = sp.createDataFrame(pd.read_csv("C:\\temp\\data.csv"))
df.show(5)
09-05-2024 05:50 AM
Unfortunately I don't have this information. I only raised it for 14.3 LTS since my Databricks Connect version is the same (14.3.1).
09-05-2024 05:12 AM
The error is now occurring for me on 15.4 clusters as well. Has the fix been released yet?
09-05-2024 06:03 AM
I don't think it has been released yet, and now I’m facing this issue on both 14.3 LTS and 15.4 LTS :(.
FYI @Retired_mod
09-05-2024 08:00 AM
I have the same issue with the 13.3 LTS runtime.
09-06-2024 12:29 AM
I was informed that the fix was released. According to our tests, however, it did not fix anything; instead, 15.4 LTS is now broken too. This topic is getting urgent for us.
09-08-2024 11:25 PM
Any updates on this? I'm facing the same issue with 15.4 LTS now as well.
09-09-2024 01:28 AM - edited 09-09-2024 01:29 AM
Microsoft support just mentioned that the fix has been deployed by Databricks, but the issue persists for me on both 14.3 LTS and 15.4 LTS.
09-09-2024 01:55 AM
Now I'm seeing the same issue with 15.4 LTS. Does the fix for 14.3 LTS work? Thanks!
09-10-2024 02:25 AM
I still have the issue, but I noticed I do not have it on Linux. Support told me to use this line:
spark.conf.set("spark.sql.session.localRelationCacheThreshold", 64 * 1024 * 1024)
With that, it worked on Windows as well; the 64 MiB value also hints at what the batch-size threshold should be.
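In case it helps others, here is a minimal sketch of how that line slots into the original repro. The connection details are placeholders, and the comment about inlining reflects my reading of the Spark Connect behavior rather than anything official:

import pandas as pd
from databricks.connect import DatabricksSession

# Placeholder connection details; substitute your own workspace values.
spark = DatabricksSession.builder.remote(
    host="https://<workspace>.azuredatabricks.net/",
    cluster_id="<cluster-id>",
    token="<token>",
).getOrCreate()

# Raise the local-relation cache threshold to 64 MiB before calling
# createDataFrame. Local data under the threshold appears to be inlined
# into the query plan instead of being uploaded to a server-side cache
# and referenced by hash, which is the lookup that fails above.
spark.conf.set("spark.sql.session.localRelationCacheThreshold", 64 * 1024 * 1024)

# The ~7 MB CSV is now well under the threshold.
df = spark.createDataFrame(pd.read_csv("C:\\temp\\data.csv"))
df.show(5)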
09-10-2024 09:06 PM
I have a troubleshooting session scheduled with Databricks today regarding this issue and will keep everyone updated on the progress.
09-11-2024 09:01 AM
As a workaround, please try the following Spark configuration, which seems to have resolved the issue for me on both 14.3 LTS and 15.4 LTS.
spark.conf.set("spark.sql.session.localRelationCacheThreshold", 64 * 1024 * 1024)
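If you want to confirm the setting actually took effect on your session before retrying, reading it back is a quick sanity check (Spark returns conf values as strings):

# Expect '67108864', i.e. 64 * 1024 * 1024.
print(spark.conf.get("spark.sql.session.localRelationCacheThreshold"))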
09-11-2024 09:04 PM
Databricks confirmed the same workaround while they work on a permanent fix.