Data Engineering

How to handle multilines coming from CSV file in a quoted string

MounicaVemulapa
New Contributor III

How to handle multilines coming from CSV file in a quoted string

4 REPLIES

mathan_pillai
Valued Contributor

Hi @Mounica Vemulapalli

Do you mean how to handle multiline values in the source CSV file? When using the spark.read API, did you try setting the multiLine option to true? Please try it and let us know how it goes.


.option("multiLine","true")
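
For reference, here is a minimal sketch of how that option fits into a full read. The sample file, its column names, and the helper name read_multiline_csv are made up for illustration, and the Spark part assumes PySpark is available (on Databricks the spark session already exists in a notebook). The standard-library part only shows that the file has more physical lines than logical records, which is the situation multiLine is meant to handle.

```python
import csv
import os
import tempfile

# Hypothetical sample data: the "comment" value of record 1 spans two lines.
rows = [["id", "comment"], ["1", "first line\nsecond line"], ["2", "single line"]]

path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    # QUOTE_ALL wraps every field in double quotes, including the multiline one.
    csv.writer(f, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)

# Count physical lines in the file versus logical CSV records.
with open(path, newline="") as f:
    physical_lines = len(f.read().splitlines())
with open(path, newline="") as f:
    logical_records = list(csv.reader(f))

print(physical_lines, len(logical_records))


def read_multiline_csv(spark, csv_path):
    """Read a CSV whose quoted fields may contain newlines.

    `spark` is an existing SparkSession (assumed; not created here).
    """
    return (
        spark.read
        .option("header", "true")
        .option("multiLine", "true")  # keep quoted newlines inside one record
        .csv(csv_path)
    )
```

In a notebook this could be called as read_multiline_csv(spark, path).show(truncate=False), with a path that the cluster can actually reach.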

Thanks

@Mathan Pillai Yes, I tried it. But each line of a multiline column value in the file is still treated as a separate row.

mathan_pillai
Valued Contributor

Hi,

Can you try the escape and quote parameters to indicate which characters need to be ignored? The escape character within the quotes will be ignored, so you can specify that the newline character needs to be ignored. Please refer to the documentation below for more information.

https://docs.databricks.com/spark/latest/data-sources/read-csv.html#reading-files

  • quote: by default the quote character is ", but can be set to any character. Delimiters inside quotes are ignored.
  • escape: by default the escape character is \, but can be set to any character. Escaped quote characters are ignored.
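
As a rough sketch of how quote and escape can be combined with multiLine: one common cause of mis-split rows is a file that escapes embedded quotes by doubling them ("") rather than with a backslash. Whether that applies to the file in this thread is an assumption. The sample data and the helper name read_quoted_csv are hypothetical, and the Spark part assumes an existing spark session.

```python
import csv
import os
import tempfile

# Hypothetical sample: one field contains a quote, a comma, and a newline.
tricky = 'She said "hi",\nthen left'
rows = [["id", "comment"], ["1", tricky], ["2", "plain"]]

path = os.path.join(tempfile.mkdtemp(), "quoted.csv")
with open(path, "w", newline="") as f:
    # The csv module doubles embedded quotes by default: "hi" becomes ""hi"".
    csv.writer(f, quoting=csv.QUOTE_ALL, lineterminator="\n").writerows(rows)

# Read the raw text and the parsed records back to inspect the quoting.
with open(path, newline="") as f:
    raw_text = f.read()
with open(path, newline="") as f:
    parsed = list(csv.reader(f))

print(raw_text)


def read_quoted_csv(spark, csv_path):
    """Read a CSV that doubles embedded quotes and has multiline fields.

    `spark` is an existing SparkSession (assumed; not created here).
    """
    return (
        spark.read
        .option("header", "true")
        .option("multiLine", "true")  # quoted newlines stay inside one record
        .option("quote", '"')         # character that wraps a field
        .option("escape", '"')        # treat "" inside a field as a literal quote
        .csv(csv_path)
    )
```

If the file escapes quotes with a backslash instead, the default escape value already covers that and only multiLine needs to be set.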

Thanks

In my case, none of the three options (escape, quote, and .option("multiLine","true")) is working. I am still facing the issue: the data is not separated properly.
