Databricks Community

Soumik · ‎03-03-2025

Hi All,

I am trying to read an Excel file (input_file.xlsx) using pandas.read_excel, with the options below:

import pandas as pd

df = pd.read_excel(input_file, sheet_name=sheetname, dtype=str, na_filter=False, keep_default_na=False)

I'm not sure why, but the value #N/A is read as null/NaN, whereas other default NA values such as N/A and NA are read as strings, which is what I expect. Does anyone know a solution or workaround?

Brahmareddy · ‎03-15-2025

Hi Soumik,

How are you doing today? As per my understanding, it looks like pandas is still treating #N/A as a missing value, possibly because Excel considers it a special error value rather than ordinary text. Even though you've set na_filter=False and keep_default_na=False, pandas might still be handling it differently.

A possible workaround is to explicitly set na_values=[] in read_excel, together with keep_default_na=False, so that pandas has no strings left to interpret as NaN. Try updating your code like this:

df = pd.read_excel(input_file, sheet_name=sheetname, dtype=str, na_values=[], keep_default_na=False)

If #N/A is stored in the sheet as text, this should keep it as a string instead of converting it to null. If the cell holds a real Excel error value, the reader engine may still return NaN, so please check the result. Let me know if it helps!
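To show roughly what na_values=[] and keep_default_na=False do at the text-parsing level, here is a minimal sketch. It is an illustration under assumptions, not a reproduction of your file: the sample data is made up, and pd.read_csv on an in-memory string is swapped in for read_excel only so the snippet runs without a workbook. It does not cover cells that Excel stores as a true #N/A error, which the Excel reader engine may handle separately.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for one column of the sheet.
csv_text = "col\n#N/A\nN/A\nNA\nvalue\n"

# Default settings: "#N/A", "N/A" and "NA" are all in pandas' default
# NA list, so they are parsed as NaN.
default_df = pd.read_csv(io.StringIO(csv_text))

# Default NA list disabled and no custom NA values supplied:
# the same tokens are kept as plain strings.
raw_df = pd.read_csv(
    io.StringIO(csv_text),
    dtype=str,
    na_values=[],
    keep_default_na=False,
)

print(default_df["col"].isna().tolist())
print(raw_df["col"].tolist())
```

If the real workbook still gives NaN for #N/A with the same options, that would suggest the cells are Excel error values and not text.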

Regards,

Brahma