Data shifted when a pyspark dataframe column only contains a comma

fabien_arnaud

I have a DataFrame with several columns. In one of them, the value for one specific record is just a comma and nothing else.

When displaying the dataframe with the command

from pyspark.sql.functions import col
display(df_input.where(col("erp_vendor_cd") == 'B6SA-VEN0008838'))
 
the data is displayed correctly for all of my columns.
 
However, when I select specific columns from the same dataframe, i.e.
 
display(df_input.where(col("erp_vendor_cd") == 'B6SA-VEN0008838').select(col("postal_cd"), col("state_cd"), col("state_nm"), col("country_cd"), col("country_nm")))
 
the values in every column to the right of the one that contains only the comma are shifted to the left. The comma seems to be treated as a column separator during the "select", even though everything is loaded correctly into my DataFrame.
How can I avoid this behavior?
 
I use Databricks Runtime 12.2 LTS and my notebook is in Python.
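
One way to narrow this down might be to check whether the shift exists in the data itself or only in what display() renders. The sketch below compares the values of the full row with the values returned after the select, bypassing display() entirely. The helper names mismatched_columns and check_select are hypothetical, and the sketch assumes the filter matches exactly one record.

```python
def mismatched_columns(full_row: dict, selected_row: dict) -> list:
    """Return the selected column names whose values differ from the full row."""
    return [
        name
        for name, value in selected_row.items()
        if full_row.get(name) != value
    ]


def check_select(df_input, vendor_cd: str, columns: list) -> list:
    """Compare one record before and after select(), without using display().

    Assumes the filter on erp_vendor_cd matches exactly one record.
    """
    # Imported here so the pure-Python helper above works without Spark.
    from pyspark.sql.functions import col

    filtered = df_input.where(col("erp_vendor_cd") == vendor_cd)
    full_row = filtered.first().asDict()
    selected_row = filtered.select(*columns).first().asDict()
    return mismatched_columns(full_row, selected_row)


# Example call in the notebook (df_input is the DataFrame from the post):
# check_select(
#     df_input,
#     "B6SA-VEN0008838",
#     ["postal_cd", "state_cd", "state_nm", "country_cd", "country_nm"],
# )
```

If the returned list is empty, the selected values match the full row, which would suggest the shift is introduced when the result is rendered rather than by the select. A non-empty list would point to the data itself being different after the select.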