
Handle comma inside cell of CSV

AnandJ_Kadhi
New Contributor II

We are using spark-csv_2.10, version 1.5.0, to read a CSV file in which one column contains a comma (",") as part of its value. The problem is that the reader treats everything after that comma as a new column, so the data is not parsed correctly.

Can you please suggest a solution?

2 REPLIES

osamakhn
New Contributor II

I have been working around this with a pandas intermediary function, but a Spark solution would be helpful! I am willing to contribute as well if anyone can point me in the right direction.

User16857282152
Contributor

Take a look here for the available options:

http://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=dataframereader#pyspark.sq...

If a field in a CSV file contains commas, the convention is to wrap that field in quotes.

In particular, see whether setting some of the options from that documentation helps, such as:

quote – sets a single character used for escaping quoted values where the separator can be part of the value. If None is set, it uses the default value, ". If you would like to turn off quotations, you need to set an empty string.
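To make the quote option concrete, here is a minimal sketch. It assumes Spark 2.0+ with the built-in CSV reader and a local pyspark install; the file contents and the helper names `write_sample_csv` and `read_with_spark` are made up for illustration and are not part of any library.

```python
import csv
import os
import tempfile


def write_sample_csv(path):
    """Write a tiny CSV whose second column contains a comma."""
    with open(path, "w", newline="") as f:
        # csv.writer's default quoting wraps any field containing the delimiter in quotes
        writer = csv.writer(f)
        writer.writerow(["id", "address"])
        writer.writerow(["1", "12 Main St, Springfield"])


def read_with_spark(path):
    """Read the file with an explicit quote character.

    Assumption: Spark 2.0+ (built-in CSV source) and pyspark are available.
    """
    from pyspark.sql import SparkSession  # imported here so the rest runs without Spark

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("csv-quote-demo")
        .getOrCreate()
    )
    # quote='"' is the documented default; escape='"' is only needed if the file
    # doubles quotes inside quoted fields ("") instead of backslash-escaping them
    return spark.read.csv(path, header=True, quote='"', escape='"')
```

Calling `write_sample_csv(path)` and then `read_with_spark(path).show()` should give one row with two columns rather than three. On spark-csv 1.5.0 I believe the equivalent is `sqlContext.read.format("com.databricks.spark.csv")` with `.option("header", "true")` and `.option("quote", "\"")`, but I have not tested that version here.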

Also, you may have poorly formatted data. In that case you might need to read each whole line as a string into a dataframe with a single column, and then split that string yourself to build the final dataframe.
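For the badly formatted case, a rough sketch of the read-as-text-then-split approach might look like the following. It assumes you know the total number of columns and that only one column (the free-text one) can contain unquoted commas. The function names, the column names `id`, `name`, `state`, and the sample layout are all hypothetical.

```python
def split_row(line, n_cols, text_col):
    """Split a line into n_cols fields when only column text_col may contain commas."""
    # Peel off the fields to the left of the free-text column
    head = line.split(",", text_col)
    if len(head) <= text_col:
        raise ValueError("too few fields: %r" % line)
    rest = head.pop()  # free-text column plus everything to its right
    # Peel off the fields to the right of the free-text column, working from the end
    n_tail = n_cols - text_col - 1
    tail = rest.rsplit(",", n_tail)
    if len(tail) != n_tail + 1:
        raise ValueError("too few fields: %r" % line)
    return head + tail


def load_messy_csv(spark, path):
    """Assumption: `spark` is an existing SparkSession (Spark 2.0+)."""
    lines = spark.read.text(path)  # single string column named "value"
    rows = lines.rdd.map(lambda r: tuple(split_row(r.value, 3, 1)))
    return rows.toDF(["id", "name", "state"])  # hypothetical column names
```

For example, `split_row("1,Smith, John,NY", 3, 1)` keeps `Smith, John` together as the middle field. This only works when there is a single ambiguous column; if several columns can contain stray commas, the file probably needs to be fixed at the source.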
