01-18-2022 10:00 PM
I loaded a CSV file with five columns into a DataFrame and then added 15+ columns using the DataFrame.withColumn method.
After adding these columns, running df.rdd.isEmpty() throws the error below.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 32.0 failed 4 times, most recent failure: Lost task 0.3 in stage 32.0 (TID 28) (10.139.64.4 executor 9): ExecutorLostFailure (executor 9 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.
Any idea what the issue is?
01-20-2022 03:45 AM
Please check your logs, as it could be some other issue.
Please also try bool(df.head(1)) instead. Note that it is True when the DataFrame has at least one row, so it is the inverse of isEmpty().
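To make the suggestion above concrete, here is a minimal sketch of an emptiness check built on head(1). The helper name is_empty is hypothetical, not part of PySpark; the only assumption is that df.head(1) returns a list holding at most one row, which is how the PySpark DataFrame API behaves. It stays in the DataFrame API and avoids the conversion to an RDD, but it is not guaranteed to fix the executor failure.

```python
def is_empty(df):
    """Return True if the DataFrame has no rows.

    Hypothetical helper: relies only on df.head(1) returning a list
    with zero or one rows, as PySpark's DataFrame.head(n) does.
    """
    # An empty list means no row could be fetched.
    return len(df.head(1)) == 0
```

Usage would be is_empty(df2) in place of df2.rdd.isEmpty().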
01-19-2022 08:40 AM
Hello again, @Thushar R - I'm sorry to hear that you're having this difficulty also. Let's give the community a chance to respond first. Thanks in advance for your patience.
01-20-2022 11:57 PM
Thanks for the workaround. But why does this particular piece of code fail in the 9.0 LTS runtime and run in 8.3 without issues? Any idea? Please see the code below.
from pyspark.sql.functions import lit,col,row_number,floor,trim

df = spark.read.option("header", "true").csv(filePath)
df2 = df.select(col("cc"), col("ac"), col("an"),
                col("ag"), col("at")).distinct()
lstOfMissingColumns = ['col1', 'col2', 'col3', 'col4', 'col5', 'col6', 'col7', 'col8', 'col8', 'col9', 'col9', 'col10', 'col11', 'col12', 'col13',
                       'col14', 'col15', 'col16', 'col17']
for c in lstOfMissingColumns:
    df2 = df2.withColumn(c, lit(''))
df2.rdd.isEmpty()
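One thing that may be worth trying, though nothing in this thread confirms it as the cause: the Spark documentation for withColumn warns that calling it in a loop adds one projection per call and can produce very large plans, and recommends a single select() instead. A sketch under that assumption follows. The helper names unique_in_order and add_empty_string_columns are hypothetical, and the sketch leaves columns that already exist untouched (unlike withColumn, which would overwrite them).

```python
def unique_in_order(names):
    """Drop repeated names, keeping the first occurrence of each."""
    return list(dict.fromkeys(names))


def add_empty_string_columns(df, names):
    """Add every missing column in one select() instead of a withColumn loop."""
    # Imported here so unique_in_order can be used without PySpark installed.
    from pyspark.sql.functions import lit

    # Skip names the DataFrame already has, to avoid duplicate column names.
    new_cols = [
        lit('').alias(name)
        for name in unique_in_order(names)
        if name not in df.columns
    ]
    return df.select('*', *new_cols)
```

With the code above, this would be called as df2 = add_empty_string_columns(df2, lstOfMissingColumns).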
02-23-2022 05:11 PM
Hi @Thushar R,
Are you using the same CSV file?
The error message is
"Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages", which could indicate an OOM error. How big is your CSV file? Have you checked the logs for executor 9?
02-16-2022 09:04 AM
@Thushar R - Thank you for your patience. We are looking for the best person to help you.