Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.

Leodatabricks
Contributor

I have been getting this error sporadically. I'm loading a dataset and training a model on it in a notebook. Sometimes it works and sometimes it doesn't. I have seen similar posts and tried all the solutions mentioned there: raising the log output size limit, tuning the spark.network.timeout configuration, and creating a temporary view. None of them fundamentally solved the issue; sometimes the job runs without any problems, and sometimes I get the error above. I'm fairly sure there is no memory issue, and I have allocated enough cluster memory. Could you please shed some light on what is causing this? In particular, I don't understand why it only breaks some of the time but not always, which makes it really hard to pinpoint. Thank you!
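For reference, the timeout-related mitigation mentioned above is usually applied through Spark configuration. A minimal sketch, assuming a Databricks/PySpark notebook; the concrete values here are illustrative assumptions, not recommendations:

```python
# Sketch of the spark.network.timeout tuning mentioned above.
# Values are illustrative assumptions only.
timeout_settings = {
    "spark.network.timeout": "600s",            # Spark's default is 120s
    "spark.executor.heartbeatInterval": "60s",  # must stay well below the timeout
}

# On a live cluster these are typically set in the cluster's Spark config
# (some of them require a cluster restart), or at session level:
# spark.conf.set("spark.network.timeout", timeout_settings["spark.network.timeout"])

def parse_seconds(value: str) -> int:
    """Parse a simple 'Ns' duration string into an integer number of seconds."""
    return int(value.rstrip("s"))

# Spark requires the executor heartbeat interval to be smaller than the
# network timeout, so sanity-check the pair before applying it.
assert parse_seconds(timeout_settings["spark.executor.heartbeatInterval"]) \
    < parse_seconds(timeout_settings["spark.network.timeout"])
```

If the heartbeat interval is not well below the timeout, executors can be declared lost spuriously, which produces exactly this kind of intermittent "Remote RPC client disassociated" failure.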

2 REPLIES

karthik_p
Esteemed Contributor

@Leo Bao Are you seeing this issue only when the dataset size varies, or does it happen with the same dataset size as well? If the failures occur with larger datasets, please check the link below and try increasing the number of partitions: Databricks Spark Pyspark RDD Repartition - "Remote RPC client disassociated. Likely due to container...

Thank you for your reply! It happens whenever I use datasets of different sizes, but it's not only the larger ones; I see issues even with smaller datasets. Just curious, is there a rule of thumb for the size of each partition that might make it work? I did try adjusting the partition size, and it still works sometimes and fails other times.
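On the rule-of-thumb question: a common community heuristic (an assumption, not an official Databricks figure) is to aim for roughly 100-200 MB of data per partition, so the partition count scales with dataset size instead of being fixed. A minimal sketch:

```python
# Rule-of-thumb sketch: target ~128 MB of data per partition.
# The 128 MB figure is a common community heuristic, not an official number.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024

def suggested_partitions(dataset_bytes: int, min_partitions: int = 8) -> int:
    """Return a partition count targeting ~128 MB per partition,
    with a floor so small datasets still get some parallelism."""
    # Ceiling division without importing math.
    return max(min_partitions, -(-dataset_bytes // TARGET_PARTITION_BYTES))

# With a live DataFrame this would be applied as (hypothetical variable names):
# df = df.repartition(suggested_partitions(estimated_size_in_bytes))
```

For example, a ~10 GiB dataset would land at 80 partitions under this heuristic, while a tiny dataset stays at the floor of 8.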
