Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Spark Read CSV doesn't preserve the double quotes while reading!

DineshKumar
New Contributor III

Hi, I am trying to read a CSV file in which one column contains double quotes, like below.

James,Butt,"Benton, John B Jr",6649 N Blue Gum St
Josephine,Darakjy,"Chanay, Jeffrey A Esq",4 B Blue Ridge Blvd
Art,Venere,"Chemel, James L Cpa",8 W Cerritos Ave #54
Lenna,Paprocki,Feltz Printing Service,639 Main St,Anchorage
Donette,Foller,Printing Dimensions,34 Center St,Hamilton
Simona,Morasca,"Chapman, Ross E Esq",3 Mcauley Dr 

I am using the code below to keep the double quotes as they are in the CSV file (a few rows have double quotes and a few don't).

val df_usdata = spark.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .option("quote", "\"")
  .load("file:///E://data//csvdata.csv")

df_usdata.show(false)

But it doesn't preserve the double quotes inside the DataFrame, even though it should.

The .option("quote", "\"") is not working. I am using Spark 2.3.1.

The output should be like below.

+----------+---------+-------------------------+---------------------+
|first_name|last_name|company_name             |address              | 
+----------+---------+-------------------------+---------------------+ 
|James     |Butt     |"Benton, John B Jr"      |6649 N Blue Gum St   |
|Josephine |Darakjy  |"Chanay, Jeffrey A Esq"  |4 B Blue Ridge Blvd  |
|Art       |Venere   |"Chemel, James L Cpa"    |8 W Cerritos Ave #54 |
|Lenna     |Paprocki |Feltz Printing Service   |639 Main St          | 
|Donette   |Foller   |Printing Dimensions      |34 Center St         | 
|Simona    |Morasca  |"Chapman, Ross E Esq"    |3 Mcauley Dr         |
+----------+---------+-------------------------+---------------------+ 

Regards, Dinesh Kumar
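For context on why this happens: in the CSV format (RFC 4180), the quote character marks field boundaries and is consumed by the parser rather than kept as data, so any conforming CSV reader strips the quotes, not just Spark. A quick illustration using Python's standard csv module (Python here only to show the format semantics; Spark's reader behaves the same way):

```python
import csv
import io

raw = 'James,Butt,"Benton, John B Jr",6649 N Blue Gum St\n'

# A standard CSV parser keeps the comma inside the quoted field
# but consumes the surrounding quotes.
row = next(csv.reader(io.StringIO(raw)))
print(row)
# ['James', 'Butt', 'Benton, John B Jr', '6649 N Blue Gum St']
```

So the quotes in the desired output above are not something the reader can preserve on its own; they have to be re-added after parsing.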

5 REPLIES

DineshKumar
New Contributor III

When I tried

.option("quote", "")
and .option("quote", "\u0000"), the company_name column values got split into the next column, like below.

+----------+---------+-------------------------+---------------------+
|first_name|last_name|company_name             |address              |
+----------+---------+-------------------------+---------------------+
|James     |Butt     |"Benton                  | John B Jr"          |
|Josephine |Darakjy  |"Chanay                  | Jeffrey A Esq"      |
|Art       |Venere   |"Chemel                  | James L Cpa"        |
|Lenna     |Paprocki |Feltz Printing Service   |639 Main St          |
|Donette   |Foller   |Printing Dimensions      |34 Center St         |
|Simona    |Morasca  |"Chapman                 | Ross E Esq"         |
+----------+---------+-------------------------+---------------------+
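That splitting is expected: once the quote character is disabled (empty string or \u0000), every comma becomes a field separator, including the ones inside the quoted company names, so the quotes survive but the fields break apart. The same effect can be shown with Python's stdlib csv module with quoting turned off (illustrative only; Spark's parser does the equivalent):

```python
import csv
import io

raw = 'James,Butt,"Benton, John B Jr",6649 N Blue Gum St\n'

# With quoting disabled, quotes are ordinary characters and every
# comma splits a field -- the quoted name breaks into two columns.
row = next(csv.reader(io.StringIO(raw), quoting=csv.QUOTE_NONE))
print(row)
# ['James', 'Butt', '"Benton', ' John B Jr"', '6649 N Blue Gum St']
```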

Forum_Admin
Contributor

Try using both of these options:

.option("quote", "\"")

.option("escape", "\"")

Thanks, this resolved my issue with the CSV generation.
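Note that quote plus escape makes the fields parse correctly (commas inside quotes no longer split columns), but the enclosing quotes are still consumed by the parser. If, as in the original question, the literal quotes need to survive into the data, one workaround is to add them back after the read to any value that contained a comma. A minimal sketch of that re-quoting rule (plain Python just to show the logic; in Spark you could apply the same rule with when/concat or a UDF on the company_name column):

```python
def requote(value):
    """Re-add the quotes a CSV parser consumed: only fields that
    contained a comma needed quoting in the source file."""
    if value is not None and "," in value:
        return '"' + value + '"'
    return value

print(requote("Benton, John B Jr"))       # quotes restored
print(requote("Feltz Printing Service"))  # unchanged
```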

Munni
New Contributor II

Hi, I am currently facing the same issue. Please let me know how this issue was resolved.

Thanks,

Munni

LearningAj
New Contributor II

Hi Team,

I am also facing the same issue, and I have applied all the options mentioned in the posts above.

I will just post my dataset here:

[Image attachment: LearningAj_0-1691694163947.png]

Attached is my input data with 3 columns, of which the comment column contains text values with double quotes and commas. To read this dataset I have used all the escape options, but the comment column's data still spills into the third column.

Below is the dataset from csv after performing read:

[Image attachment: LearningAj_1-1691694446508.png]

Could you please help with this issue as soon as possible?

 
