10-21-2024 01:58 AM
I have a dataframe with several columns, one of which contains, for one specific record, just a comma and nothing else.
When displaying the dataframe with the command
10-21-2024 02:03 AM
Here is a screenshot of my code and the output:
10-21-2024 03:11 AM - edited 10-21-2024 03:12 AM
Hi @fabien_arnaud,
I have tried to reproduce the issue using DBR 12.2 and in my case everything works as expected:
Could you share how this dataframe is created? Are you reading a CSV file, perhaps?
Also, could you create a new dataframe:
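from pyspark.sql.functions import col

# Keep only the affected vendor record and the columns around the lone comma
df_filtered = df_input.where(col("erp_vendor_cd") == 'B6SA-VEN0008838').select(col("postal_cd"), col("state_cd"), col("state_nm"), col("country_cd"), col("country_nm"))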
And then run:
df_filtered.printSchema()
df_filtered.show()
Let's check whether the problem is in the dataframe itself or whether the display() function renders it incorrectly because of the standalone comma.
10-21-2024 04:25 AM
Yes, the dataframe is read from a CSV file. Here is the code:
10-21-2024 05:05 AM
Hi @fabien_arnaud,
I think I know the issue.
Could you please change your escape character (escape = '"') so that it differs from your quote character (quote = '"')?
For example, set it to a backslash (\).
Your CSV contains a sequence like ",",", and one of the quotes is being used to escape the comma.
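To make the difference between the two conventions concrete, here is a minimal sketch using Python's standard csv module. This is not Spark's CSV reader, so it says nothing about how a given Databricks runtime behaves; it only shows what a row with a quoted standalone comma is expected to parse to. The sample rows and the parse_line helper are made up for illustration.

```python
import csv
import io


def parse_line(line, **reader_options):
    """Parse a single CSV line with Python's csv module and return its fields."""
    return next(csv.reader(io.StringIO(line), **reader_options))


# Hypothetical row: the quote character also escapes quotes by doubling ("").
doubled = 'V001,"ACME ""Best"" Ltd",","'
print(parse_line(doubled))

# Same hypothetical row, but with a backslash as a separate escape character.
backslash = 'V001,"ACME \\"Best\\" Ltd",","'
print(parse_line(backslash, doublequote=False, escapechar="\\"))
```

Both calls should return the same three fields, with the last one being the lone comma. A separate escape character only works if the file itself is written that way.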
Let us know if that helps
10-21-2024 06:28 AM
I can't actually change the escape character: the source file uses the double quote as its escape character, and it is required to parse other columns in the dataframe correctly, such as the case below where the name column contains double quotes in the data value:
As mentioned earlier, though, the file is read correctly with Databricks Runtime 15.4 LTS, so that will probably have to be the way forward. I hadn't upgraded yet because I had issues installing the various dependencies on the new Ubuntu version used by that runtime, but I managed in the end.
I really appreciate the time you spent trying to help me out and your suggestions, Filip!
a month ago
Thank you so much for the solution.