Unable to replace null with 0 in a DataFrame using PySpark in a Databricks notebook (Community Edition)

db-avengers2rul
Contributor II

Hello Experts,

I am unable to replace nulls with 0 in a dataframe. Please refer to the screenshot.

from pyspark.sql.functions import col
emp_csv_df = emp_csv_df.na.fill(0).withColumn("Total_Sal",col('sal')+col('comm'))
display(emp_csv_df)

Error:

[Screenshot: unable to fill nulls with 0 in dataframe using PySpark in Databricks]

Desired output:

[Screenshot 2022-10-03 at 20.26.23]

Any suggestions?

Regards,

Rakesh

Hubert-Dudek
Databricks MVP

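I bet these are not real nulls but the string "null". Note that na.fill(0) only fills numeric columns, so a string column holding the text "null" is left untouched. Please check what the source file actually contains and try replacing that string instead.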


My blog: https://databrickster.medium.com/
