08-24-2020 09:52 AM
Hi, I am trying to read a CSV file in which one column contains double-quoted values, like below.
James,Butt,"Benton, John B Jr",6649 N Blue Gum St
Josephine,Darakjy,"Chanay, Jeffrey A Esq",4 B Blue Ridge Blvd
Art,Venere,"Chemel, James L Cpa",8 W Cerritos Ave #54
Lenna,Paprocki,Feltz Printing Service,639 Main St,Anchorage
Donette,Foller,Printing Dimensions,34 Center St,Hamilton
Simona,Morasca,"Chapman, Ross E Esq",3 Mcauley Dr
I am using the code below and want to keep the double quotes exactly as they appear in the CSV file (some rows have double quotes and some don't).
val df_usdata = spark.read.format("com.databricks.spark.csv")
  .option("header", "true")
  .option("quote", "\"")
  .load("file:///E://data//csvdata.csv")
df_usdata.show(false)
But this does not preserve the double quotes inside the DataFrame, and I need them to be kept.
The .option("quote","\"") setting is not working. I am using Spark 2.3.1.
The output should be like below.
+----------+---------+-------------------------+---------------------+
|first_name|last_name|company_name             |address              | 
+----------+---------+-------------------------+---------------------+ 
|James     |Butt     |"Benton, John B Jr"      |6649 N Blue Gum St   |
|Josephine |Darakjy  |"Chanay, Jeffrey A Esq"  |4 B Blue Ridge Blvd  |
|Art       |Venere   |"Chemel, James L Cpa"    |8 W Cerritos Ave #54 |
|Lenna     |Paprocki |Feltz Printing Service   |639 Main St          | 
|Donette   |Foller   |Printing Dimensions      |34 Center St         | 
|Simona    |Morasca  |"Chapman, Ross E Esq"    |3 Mcauley Dr         |
+----------+---------+-------------------------+---------------------+
Regards,
Dinesh Kumar
08-25-2020 09:02 AM
When I tried with
.option("quote","")
the quoted values were split at the comma:
+----------+---------+-------------------------+---------------------+
|first_name|last_name|company_name             |address              |
+----------+---------+-------------------------+---------------------+
|James     |Butt     |"Benton                  | John B Jr"          |
|Josephine |Darakjy  |"Chanay                  | Jeffrey A Esq"      |
|Art       |Venere   |"Chemel                  | James L Cpa"        |
|Lenna     |Paprocki |Feltz Printing Service   |639 Main St          |
|Donette   |Foller   |Printing Dimensions      |34 Center St         |
|Simona    |Morasca  |"Chapman                 | Ross E Esq"         |
+----------+---------+-------------------------+---------------------+
08-06-2021 02:16 AM
Try using both of these options:
.option("quote", "\"").option("escape", "\"")
01-21-2022 02:29 AM
Thanks, this resolved my issue with the CSV generation.
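A note for anyone landing here later: as far as I can tell, Spark's CSV reader treats the quote character as syntax and drops it while parsing, so the quote and escape options keep "Benton, John B Jr" together in one column but do not keep the quotes themselves. Below is a minimal sketch of one possible workaround. It assumes it is acceptable to put quotes back around any company_name value that contains a comma; that rule is my assumption, since the reader does not report which fields were quoted in the file. The temp file and the local SparkSession are made up for illustration (in spark-shell, `spark` already exists).

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, concat, lit, when}

// Local session for this sketch only; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[1]").appName("keep-quotes").getOrCreate()

// Hypothetical sample file with two rows taken from the question.
val csv = Files.createTempFile("csvdata", ".csv")
Files.write(csv,
  """first_name,last_name,company_name,address
    |James,Butt,"Benton, John B Jr",6649 N Blue Gum St
    |Lenna,Paprocki,Feltz Printing Service,639 Main St
    |""".stripMargin.getBytes("UTF-8"))

// The parser uses the quotes to keep the comma inside one field, then drops them.
val parsed = spark.read
  .option("header", "true")
  .option("quote", "\"")
  .option("escape", "\"")
  .csv(csv.toUri.toString)

// Assumption: a value containing a comma must have been quoted in the file,
// so wrap it in quotes again; leave every other value untouched.
val quoteChar = lit("\"")
val restored = parsed.withColumn(
  "company_name",
  when(col("company_name").contains(","),
    concat(quoteChar, col("company_name"), quoteChar))
    .otherwise(col("company_name")))

restored.show(false)
```

With this, the James row shows company_name as "Benton, John B Jr" including the quotes, and the Lenna row is unchanged. Values that were quoted in the file but contain no comma would not get their quotes back, which is the limit of this approach.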
09-14-2022 10:15 PM
Hi, I am currently facing the same issue. Please let me know how it was resolved.
Thanks,
Munni
08-10-2023 12:08 PM
Hi Team,
I am also facing the same issue, and I have applied all the options mentioned in the posts above.
Here is my dataset:
The attached input data has 3 columns. The comment column contains text with double quotes and commas, and I have tried all the escape options when reading this dataset, but the comment column's data is still moving into the third column.
Below is the dataset from the CSV after performing the read:
Could you please help with this issue ASAP?
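Since the attachment is not visible here, the following is only a sketch of one thing worth checking, not a confirmed fix. It assumes the comment values are wrapped in double quotes and that quotes inside them are doubled (""), which is how most CSV exporters write them. The column names, sample rows, and temp file are hypothetical. The multiLine option should only matter if a comment contains line breaks.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Local session for this sketch only; spark-shell already provides `spark`.
val spark = SparkSession.builder().master("local[1]").appName("quoted-comments").getOrCreate()

// Hypothetical 3-column file: the comment holds both commas and doubled quotes.
val csv = Files.createTempFile("comments", ".csv")
Files.write(csv,
  """id,name,comment
    |1,Asha,"She said ""hello, world"" and left"
    |2,Ravi,plain text
    |""".stripMargin.getBytes("UTF-8"))

val df = spark.read
  .option("header", "true")
  .option("quote", "\"")      // fields may be wrapped in double quotes
  .option("escape", "\"")     // a doubled quote inside a field is a literal quote
  .option("multiLine", "true") // allow line breaks inside quoted fields
  .csv(csv.toUri.toString)

df.show(false)
```

If the comments in the real file are not wrapped in quotes at all, or use a backslash before inner quotes, these options will not be enough and the escape setting would need to match whatever the file actually uses.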