10-18-2021 06:49 AM
Data from external source is copied to ADLS, which further gets picked up by databricks, then this massaged data is put in the outbound file . A special character ? (question mark in black diamond) is seen in some fields in outbound file which may break existing code is not identified.
10-18-2021 07:03 AM
Hi @Jazmine Kochan, what type of data is being copied? Does the data contain any Unicode characters or symbols such as ç, ã, ...?
10-18-2021 07:28 AM
Hi Prabakar,
Thanks for the prompt response.
It is a text file with customer data.
I have not seen such characters in the data, but a client could enter this kind of data in the free-text entry fields.
10-18-2021 07:44 AM
So yes, text could contain such characters.
10-18-2021 07:51 AM
So the cause of the issue is those Unicode characters. I believe there should be a fix for this. I shall check and get back here.
10-18-2021 07:58 AM
Thanks much!
10-18-2021 08:29 AM
Hi Prabakar
Could it be the developer's code that is adding this special character?
10-18-2021 08:38 AM
This is an encoding issue. Try specifying the file's encoding when reading it, for example:
.option("encoding", "UTF-16LE")
Please refer to the below:
https://docs.microsoft.com/en-us/azure/databricks/kb/data-sources/json-unicode
11-10-2021 01:30 PM
Do I need to encode and decode too? Currently, incorrect data is displayed. @Prabakar Ammeappin
10-18-2021 08:04 AM
Are you sure it is Databricks which puts the special character in place?
It could also have happened during the copy from the external system to ADLS.
If you use Azure Data Factory, for example, you have to define the encoding (UTF-8, UTF-16, ...).
10-18-2021 08:15 AM
Hi
Yes, we checked all the files in the flow. It is the output file from Databricks in which the question mark character is seen at the beginning of some lines in text fields.
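One way to check each file in the flow is to look at its raw bytes rather than at how a viewer renders them. The sketch below is only an illustration: the `classify` helper and the stage names are hypothetical, and the Latin-1 source is an assumption. It separates files that literally contain U+FFFD (the damage already happened in an earlier decode step) from files that are simply not valid UTF-8 (the viewer or reader is guessing the wrong charset).

```python
# U+FFFD encoded as UTF-8 is the byte sequence EF BF BD.
REPLACEMENT_UTF8 = "\ufffd".encode("utf-8")


def classify(data: bytes) -> str:
    """Return a rough diagnosis for the raw bytes of one file (hypothetical helper)."""
    if REPLACEMENT_UTF8 in data:
        return "contains U+FFFD"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not valid UTF-8"
    return "clean UTF-8"


# Hypothetical stand-ins for the files at each stage of the flow.
samples = {
    "source (Latin-1)": "São".encode("latin-1"),
    "after bad decode": "São".encode("latin-1")
    .decode("utf-8", errors="replace")
    .encode("utf-8"),
    "proper UTF-8": "São".encode("utf-8"),
}

for name, data in samples.items():
    print(name, "->", classify(data))
```

For real files, the bytes would come from opening each file in binary mode (`open(path, "rb").read()`) at every stage; the first stage reported as "contains U+FFFD" is where the character was introduced.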