a month ago - last edited a month ago
Hi All,
I am using Databricks Connect 14.3.2 with Databricks Runtime 14.3 LTS to execute the code below. The CSV file is only 7 MB. The code runs without issues on Databricks Runtime 15+ clusters, but on 14.3 LTS it consistently fails with the error message shown below. Please advise.
SparkConnectGrpcException: (org.apache.spark.sql.connect.common.InvalidPlanInput) Not found any cached local relation with the hash: hash_guid in the session with sessionUUID session_guid.
import pandas as pd
from databricks.connect import DatabricksSession
cluster_id = '*****'
user_token = "*******"
host = 'https://***********.azuredatabricks.net/'
sp = DatabricksSession.builder.remote(host = host,cluster_id = cluster_id,token = user_token).getOrCreate()
df = sp.createDataFrame(pd.read_csv("C:\\temp\\data.csv"))
df.show(5)
Wednesday
As a workaround, please try the following Spark configuration, which seems to have resolved the issue for me on both 14.3 LTS and 15.4 LTS.
spark.conf.set("spark.sql.session.localRelationCacheThreshold", 64 * 1024 * 1024)
4 weeks ago
Same here. It was working yesterday and stopped today.
4 weeks ago
I managed to upload data in batches of 500 rows.
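For anyone who wants to try the batching approach, here is a minimal sketch of how it could look. The helper names `batch_ranges` and `create_dataframe_in_batches` are hypothetical (not part of Databricks Connect), the 500-row default is just the batch size that worked above, and the sketch assumes every batch infers the same schema, which may not hold if a column is entirely null in one batch.

```python
from functools import reduce


def batch_ranges(n_rows, batch_size=500):
    """Yield (start, stop) row offsets that cover n_rows in batches."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, n_rows, batch_size):
        yield start, min(start + batch_size, n_rows)


def create_dataframe_in_batches(spark, pdf, batch_size=500):
    """Build one Spark DataFrame from a pandas DataFrame, batch by batch.

    `spark` is an existing session (e.g. the DatabricksSession from the
    original post) and `pdf` is a pandas DataFrame. Both are passed in,
    so this helper itself needs no extra imports.
    """
    if len(pdf) == 0:
        raise ValueError("pdf is empty, nothing to upload")
    # Each slice is sent as its own small local relation.
    parts = [
        spark.createDataFrame(pdf.iloc[start:stop])
        for start, stop in batch_ranges(len(pdf), batch_size)
    ]
    # Stitch the per-batch DataFrames back together by column name.
    return reduce(lambda left, right: left.unionByName(right), parts)
```

With the session from the original post this would be called as `df = create_dataframe_in_batches(sp, pd.read_csv("C:\\temp\\data.csv"))`. Passing an explicit schema to `createDataFrame` instead of relying on inference would be a safer variant if the batches turn out to disagree on types.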
4 weeks ago
Same here, it has not worked since yesterday. I can confirm it still works with smaller data sets. Has there been any communication from Databricks about this?
4 weeks ago
I've raised this issue with Databricks support, and they are currently investigating it.
4 weeks ago
Hi @ahsan_aj and @MichalMazurek, we are looking into this and will get back to you shortly.
4 weeks ago
Any updates on this issue? I ran into the same problem. The scripts were running just fine and then, without any changes, stopped working with this error message.
4 weeks ago
@Kaniz_Fatma, is there any update on the issue?
@KBoogaard, I raised this with Microsoft support, but they are unable to replicate the issue. I have a call with them on Monday.
4 weeks ago
I'm having the same problem.
I create a Spark DataFrame from a pandas DataFrame (10,000 rows, between 500 and 800 columns) and want to upload it. This worked fine two weeks ago, but now I'm getting the error. For some files it still works; for others it only works after reducing the number of rows and columns.
4 weeks ago
I got information from our Databricks manager that this is a known issue they are working on, although it is taking a lot of time. For us it's a huge problem for going to production with this!
4 weeks ago
Hi @Kaniz_Fatma, any news?
3 weeks ago
Try converting the pandas DataFrame to a list of dicts before passing it on.
E.g. createDataFrame(pd.read_csv('abc').to_dict('records'))
In my case there was a non-serializable attribute on the pandas DataFrame, which is dropped when converting to dicts.
By the way, logging in here to answer was an insane obstacle. My large company's subscription is set up wrong, so we can't log in to the community because our admin accounts do not have real email accounts. Setting up a personal account was only possible when not using the corp WiFi at all, I get kicked
3 weeks ago
Hi @Kaniz_Fatma, this is a really bad support experience. Is this how Databricks support manages issues? I am currently thinking about using a different solution, as this has been an outage for several days now.
3 weeks ago
No updates from my side either. We're currently using the 15.4 LTS runtime, and it's working fine. The issue seems to be with the 14.3 LTS version only.
3 weeks ago
Microsoft support confirmed that the fix has been merged and is set for release on September 3rd.