<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13087#M7811</link>
    <description>&lt;P&gt;Hi Prabakar,&lt;/P&gt;&lt;P&gt;Could the developer's code be adding this special character?&lt;/P&gt;</description>
    <pubDate>Mon, 18 Oct 2021 15:29:54 GMT</pubDate>
    <dc:creator>JK2021</dc:creator>
    <dc:date>2021-10-18T15:29:54Z</dc:date>
    <item>
      <title>An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13079#M7803</link>
      <description>&lt;P&gt;Data from external source is copied to ADLS, which further gets picked up by databricks, then this massaged data is put in the outbound file . A special character ? (question mark in black diamond) is seen in some fields in outbound file which may break existing code is not identified.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 13:49:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13079#M7803</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-10-18T13:49:00Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13080#M7804</link>
      <description>&lt;P&gt;Hi @Jazmine Kochan, what type of data is being copied? Does the data contain any non-ASCII Unicode characters or symbols such as ç or ã?&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 14:03:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13080#M7804</guid>
      <dc:creator>Prabakar</dc:creator>
      <dc:date>2021-10-18T14:03:38Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13081#M7805</link>
      <description>&lt;P&gt;Hi Prabakar,&lt;/P&gt;&lt;P&gt;Thanks for the prompt response.&lt;/P&gt;&lt;P&gt;It is a text file with customer data.&lt;/P&gt;&lt;P&gt;I have not seen such characters in the data, but a client could enter them in the free-text entry fields.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 14:28:52 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13081#M7805</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-10-18T14:28:52Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13082#M7806</link>
      <description>&lt;P&gt;So yes, text could contain such characters.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 14:44:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13082#M7806</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-10-18T14:44:54Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13083#M7807</link>
      <description>&lt;P&gt;So the cause of the issue is those Unicode characters. I believe there should be a fix for this. I shall check and get back here.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 14:51:12 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13083#M7807</guid>
      <dc:creator>Prabakar</dc:creator>
      <dc:date>2021-10-18T14:51:12Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13084#M7808</link>
      <description>&lt;P&gt;Thanks much!&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 14:58:32 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13084#M7808</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-10-18T14:58:32Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13085#M7809</link>
      <description>&lt;P&gt;Are you sure it is Databricks that inserts the special character?&lt;/P&gt;&lt;P&gt;It could also have happened while copying the data from the external system to ADLS.&lt;/P&gt;&lt;P&gt;If you use Azure Data Factory, for example, you have to define the encoding (UTF-8, UTF-16, ...).&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 15:04:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13085#M7809</guid>
      <dc:creator>-werners-</dc:creator>
      <dc:date>2021-10-18T15:04:44Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13086#M7810</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Yes, we checked all the files in the flow. The question mark character appears in the output file from Databricks, at the beginning of some lines in text fields.&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 15:15:37 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13086#M7810</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-10-18T15:15:37Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13087#M7811</link>
      <description>&lt;P&gt;Hi Prabakar,&lt;/P&gt;&lt;P&gt;Could the developer's code be adding this special character?&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 15:29:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13087#M7811</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-10-18T15:29:54Z</dc:date>
    </item>
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13088#M7812</link>
      <description>&lt;P&gt;This looks like an encoding issue. You can try specifying the encoding when reading the file.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;.option("encoding", "UTF-16LE")&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Please refer to the following:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.microsoft.com/en-us/azure/databricks/kb/data-sources/json-unicode" target="test_blank"&gt;https://docs.microsoft.com/en-us/azure/databricks/kb/data-sources/json-unicode&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.databricks.com/s/question/0D53f00001HKHnfCAH/issues-with-utf16-files-and-unicode-characters" target="test_blank"&gt;https://community.databricks.com/s/question/0D53f00001HKHnfCAH/issues-with-utf16-files-and-unicode-characters&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 18 Oct 2021 15:38:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13088#M7812</guid>
      <dc:creator>Prabakar</dc:creator>
      <dc:date>2021-10-18T15:38:18Z</dc:date>
    </item>
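    <!--
    Editorial sketch, not part of the original thread. The reply above suggests the black-diamond
    question mark comes from reading bytes with the wrong encoding. The snippet below is a minimal
    illustration in plain Python (no Spark), assuming a hypothetical sample value written as Latin-1
    and then read as UTF-8. The sample string and the helper name decode_lossy are assumptions made
    for illustration; the actual encodings in the poster's pipeline are not stated in the thread.

```python
# Hypothetical sample value; not taken from the thread.
SAMPLE = "São Paulo"


def decode_lossy(raw: bytes, encoding: str) -> str:
    """Decode bytes, substituting U+FFFD for anything invalid in `encoding`."""
    return raw.decode(encoding, errors="replace")


# Bytes as a Latin-1 writer would produce them (0xE3 for "ã").
latin1_bytes = SAMPLE.encode("latin-1")

# Read as UTF-8: 0xE3 starts a multi-byte sequence that never completes,
# so the decoder substitutes the replacement character U+FFFD.
wrong = decode_lossy(latin1_bytes, "utf-8")

# Read with the encoding the bytes were written in: the text round-trips.
right = decode_lossy(latin1_bytes, "latin-1")

print(repr(wrong))
print(repr(right))
```

    If the same mismatch is happening in the pipeline, the fix would be to tell the reader which
    encoding the source file really uses, which is what the encoding option quoted in the reply is
    for. Which encoding that is has to be confirmed against the source system.
    -->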
    <item>
      <title>Re: An unidentified special character is added in outbound file when transformed in databricks. Please help with suggestion?</title>
      <link>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13089#M7813</link>
      <description>&lt;P&gt;Do I need to encode and decode too? Incorrect data is currently displayed, @Prabakar Ammeappin.&lt;/P&gt;</description>
      <pubDate>Wed, 10 Nov 2021 21:30:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/an-unidentified-special-character-is-added-in-outbound-file-when/m-p/13089#M7813</guid>
      <dc:creator>JK2021</dc:creator>
      <dc:date>2021-11-10T21:30:29Z</dc:date>
    </item>
  </channel>
</rss>

