<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to handle multilines coming from CSV file in a quoted string in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27977#M19815</link>
    <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi,&lt;/P&gt;
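&lt;P&gt;Can you try setting the quote and escape options along with multiLine? A line break is kept as part of the column value only when it falls inside a quoted field, so the reader needs to know which character quotes a field and which character escapes a quote inside it. If the file uses characters other than the defaults, a quoted field can end early and the rest of it is read as a new row. Please refer to the documentation below for more info.&lt;/P&gt;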
&lt;P&gt; &lt;A href="https://docs.databricks.com/spark/latest/data-sources/read-csv.html#reading-files" target="test_blank"&gt;https://docs.databricks.com/spark/latest/data-sources/read-csv.html#reading-files&lt;/A&gt;&lt;/P&gt;
&lt;UL&gt;&lt;LI&gt;&lt;PRE&gt;&lt;CODE&gt;quote&lt;/CODE&gt;&lt;/PRE&gt;: by default the quote character is &lt;PRE&gt;&lt;CODE&gt;"&lt;/CODE&gt;&lt;/PRE&gt;, but can be set to any character. Delimiters inside quotes are ignored.&lt;/LI&gt;&lt;LI&gt;&lt;PRE&gt;&lt;CODE&gt;escape&lt;/CODE&gt;&lt;/PRE&gt;: by default the escape character is &lt;PRE&gt;&lt;CODE&gt;\&lt;/CODE&gt;&lt;/PRE&gt;, but can be set to any character. Escaped quote characters are ignored.&lt;/LI&gt;&lt;/UL&gt;
&lt;P&gt;Thanks&lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 13 Jun 2019 10:07:17 GMT</pubDate>
    <dc:creator>mathan_pillai</dc:creator>
    <dc:date>2019-06-13T10:07:17Z</dc:date>
    <item>
      <title>How to handle multilines coming from CSV file in a quoted string</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27974#M19812</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;How can I handle multi-line values that appear inside a quoted string when reading a CSV file?&lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2019 05:14:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27974#M19812</guid>
      <dc:creator>MounicaVemulapa</dc:creator>
      <dc:date>2019-06-12T05:14:06Z</dc:date>
    </item>
    <item>
      <title>Re: How to handle multilines coming from CSV file in a quoted string</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27975#M19813</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt; Hi @Mounica Vemulapalli &lt;/P&gt; 
&lt;P&gt; Do you mean how to handle multi-line values in the source CSV file? When using the spark.read API, did you try setting the multiLine option to true? Please try it and let us know how it goes. &lt;/P&gt; 
&lt;PRE&gt;&lt;CODE&gt;
.option("multiLine","true")
&lt;/CODE&gt;&lt;/PRE&gt; 
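&lt;P&gt;For context, here is a minimal sketch of where that option sits in a full read call. It assumes an existing SparkSession is passed in as spark, that the file has a header row, and that the path is a placeholder. The helper names multiline_csv_options and read_multiline_csv are hypothetical and only used for illustration.&lt;/P&gt;

```python
def multiline_csv_options():
    # "header" and "multiLine" are Spark CSV reader options.
    # The values are assumptions about the input file (it has a header row
    # and contains quoted fields that span several lines).
    return {
        "header": "true",
        "multiLine": "true",
    }


def read_multiline_csv(spark, path):
    # `spark` is an existing SparkSession and `path` is a placeholder location.
    # DataFrameReader.options(**kwargs) applies every entry as a reader option.
    return spark.read.options(**multiline_csv_options()).csv(path)


# Hypothetical usage inside a notebook where `spark` is already defined:
# df = read_multiline_csv(spark, "/path/to/file.csv")
# df.show(truncate=False)
```

&lt;P&gt;Keeping the options in one dictionary makes it easy to add quote or escape settings later without touching the read call.&lt;/P&gt;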
&lt;P&gt; Thanks &lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2019 11:17:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27975#M19813</guid>
      <dc:creator>mathan_pillai</dc:creator>
      <dc:date>2019-06-12T11:17:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to handle multilines coming from CSV file in a quoted string</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27976#M19814</link>
      <description>&lt;P&gt;@Mathan Pillai Yes, I tried it, but each line of a multi-line column value is still being treated as a separate row.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Jun 2019 11:47:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27976#M19814</guid>
      <dc:creator>MounicaVemulapa</dc:creator>
      <dc:date>2019-06-12T11:47:09Z</dc:date>
    </item>
    <item>
      <title>Re: How to handle multilines coming from CSV file in a quoted string</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27977#M19815</link>
      <description>&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi,&lt;/P&gt;
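&lt;P&gt;Can you try setting the quote and escape options along with multiLine? A line break is kept as part of the column value only when it falls inside a quoted field, so the reader needs to know which character quotes a field and which character escapes a quote inside it. If the file uses characters other than the defaults, a quoted field can end early and the rest of it is read as a new row. Please refer to the documentation below for more info.&lt;/P&gt;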
&lt;P&gt; &lt;A href="https://docs.databricks.com/spark/latest/data-sources/read-csv.html#reading-files" target="test_blank"&gt;https://docs.databricks.com/spark/latest/data-sources/read-csv.html#reading-files&lt;/A&gt;&lt;/P&gt;
&lt;UL&gt;&lt;LI&gt;&lt;PRE&gt;&lt;CODE&gt;quote&lt;/CODE&gt;&lt;/PRE&gt;: by default the quote character is &lt;PRE&gt;&lt;CODE&gt;"&lt;/CODE&gt;&lt;/PRE&gt;, but can be set to any character. Delimiters inside quotes are ignored.&lt;/LI&gt;&lt;LI&gt;&lt;PRE&gt;&lt;CODE&gt;escape&lt;/CODE&gt;&lt;/PRE&gt;: by default the escape character is &lt;PRE&gt;&lt;CODE&gt;\&lt;/CODE&gt;&lt;/PRE&gt;, but can be set to any character. Escaped quote characters are ignored.&lt;/LI&gt;&lt;/UL&gt;
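&lt;P&gt;Before changing reader options, it can help to check how the file actually quotes its fields. The sketch below uses Python's standard csv module, not Spark, on a made-up sample in which one field contains a line break and a doubled quote character. The sample text and the helper name parse_records are assumptions for illustration only.&lt;/P&gt;

```python
import csv
import io

# Hypothetical sample: the comment column of record 1 contains a line break
# and an embedded quote written as a doubled quote character.
SAMPLE = 'id,comment\n1,"first line\nsecond ""quoted"" line"\n2,plain\n'


def parse_records(text, quotechar='"'):
    # csv.reader keeps a line break inside a quoted field as part of the value
    # and, by default, reads a doubled quote character as one literal quote.
    return list(csv.reader(io.StringIO(text), quotechar=quotechar))


rows = parse_records(SAMPLE)
print(len(rows))  # header plus two records
print(rows[1])    # the multi-line value stays in a single field

# On the Spark side, a file that doubles its quotes like this sample is
# usually read with the escape option set to the quote character as well,
# because Spark's default escape character is a backslash. This is an
# assumption about your file, so verify it against a few raw lines first:
# spark.read.option("multiLine", "true").option("quote", '"').option("escape", '"').csv(path)
```

&lt;P&gt;If the standard parser also splits the value across rows, the field is probably not quoted in the source file, and no combination of quote and escape settings will keep it together.&lt;/P&gt;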
&lt;P&gt;Thanks&lt;/P&gt; 
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jun 2019 10:07:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/27977#M19815</guid>
      <dc:creator>mathan_pillai</dc:creator>
      <dc:date>2019-06-13T10:07:17Z</dc:date>
    </item>
    <item>
      <title>Re: How to handle multilines coming from CSV file in a quoted string</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/44999#M27766</link>
      <description>&lt;P&gt;In my case, none of these three options works. The data is still not separated properly:&lt;/P&gt;&lt;PRE&gt;escape&lt;/PRE&gt;&lt;PRE&gt;.option("multiLine","true")&lt;/PRE&gt;&lt;PRE&gt;quote&lt;/PRE&gt;</description>
      <pubDate>Fri, 15 Sep 2023 11:42:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-handle-multilines-coming-from-csv-file-in-a-quoted-string/m-p/44999#M27766</guid>
      <dc:creator>dataengineerfro</dc:creator>
      <dc:date>2023-09-15T11:42:32Z</dc:date>
    </item>
  </channel>
</rss>

