Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages. Driver stacktrace

naveenreddy1
New Contributor II

We are using a 3-node Databricks cluster with 32 GB of memory. It usually works fine, but sometimes it throws this error on its own: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues.

3 REPLIES

shyam_9
Valued Contributor

Hi @naveen reddy

If you have 3 nodes with 32 GB of memory specified for each, you have just 30 GB for everything else. The different overheads add up quickly, and it is entirely possible that this is too little, so the executors get killed for hogging memory.

Try using something like 24 GB per node, or just experiment with the values.
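
To make the overhead point concrete, here is a rough sketch of the per-node arithmetic. It assumes the open-source Spark default for spark.executor.memoryOverhead, which is the larger of 384 MiB and 10% of executor memory. The object and method names are made up for illustration, and the 30 GB heap is a hypothetical comparison value. Databricks picks executor memory per node type itself, so treat the numbers as illustrative only.

```scala
// Rough per-node memory budget (illustrative sketch, not a Databricks API).
// Assumption: overhead = max(384 MiB, 10% of executor memory), the
// open-source Spark default for spark.executor.memoryOverhead.
object MemoryBudget {
  val MinOverheadMiB = 384L

  // Off-heap overhead reserved on top of the executor heap.
  def overheadMiB(executorMemoryMiB: Long): Long =
    math.max(MinOverheadMiB, executorMemoryMiB / 10)

  // Memory left on the node for the OS and other processes.
  // A negative value means the node is over-committed.
  def headroomMiB(nodeMiB: Long, executorMemoryMiB: Long): Long =
    nodeMiB - executorMemoryMiB - overheadMiB(executorMemoryMiB)

  def main(args: Array[String]): Unit = {
    val nodeMiB = 32L * 1024 // 32 GB node, as in the question
    for (heapGiB <- Seq(24L, 30L)) { // 24 GB as suggested; 30 GB hypothetical
      val heapMiB = heapGiB * 1024
      println(s"heap=${heapGiB}g overhead=${overheadMiB(heapMiB)}MiB " +
        s"headroom=${headroomMiB(nodeMiB, heapMiB)}MiB")
    }
  }
}
```

Under these assumptions, a 24 GB heap leaves a few GB of headroom on a 32 GB node, while a 30 GB heap plus its overhead already exceeds the node's memory. That is the kind of situation where a container can be killed and the driver reports the executor as disassociated.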

I have already tried increasing and decreasing the memory, but still no luck.

RodrigoDe_Freit
New Contributor II

If your job fails, check this:

According to https://docs.databricks.com/jobs.html#jar-job-tips:

"Job output, such as log output emitted to stdout, is subject to a 20MB size limit. If the total output has a larger size, the run will be canceled and marked as failed."

That was my problem. To "fix it", I just set the logging level to ERROR:

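import org.apache.spark.SparkContext

// conf is the job's existing SparkConf
val sc = SparkContext.getOrCreate(conf)
sc.setLogLevel("ERROR") // log only ERROR and above to keep job output small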

This workaround works for me.

I still get these ERROR messages, but the job runs successfully.

I hope it helps.
