Unicode field separator to create unmanaged table in Databricks for CSV file
03-24-2022 12:18 AM
We are receiving a CSV file separated by \u318a (ㆊ). We want to create an unmanaged table in Databricks. Here is the table creation script:
CREATE TABLE IF NOT EXISTS db_test_raw.t_data_otc_poc (
  `caseidt` String,
  `worktype` String,
  `doctyp` String,
  `brand` String,
  `reqemailid` String,
  `subprocess` String,
  `accountname` String,
  `location` String,
  `lineitems` String,
  `emailsubject` String,
  `createddate` String,
  `process` String,
  `archivalbatchid` String,
  `createddt` String,
  `customername` String,
  `invoicetype` String,
  `month` String,
  `payernumber` String,
  `sapaccountnumber` String,
  SOURCE_BUSINESS_DATE Date
)
USING CSV
OPTIONS (
  header 'true',
  encoding 'UTF-8',
  quote '"',
  escape '"',
  delimiter '\u318a',
  path 'abfss://xxxx@yyyyy.dfs.core.windows.net/Raw/OPERATIONS/BUSINESSSERVICES/***/xx_DATA_OTC'
)
PARTITIONED BY (SOURCE_BUSINESS_DATE)
The table was created successfully in Databricks.
While checking with describe table extended db_test_raw.t_data_otc_poc, we found the storage properties as [encoding=UTF-8, quote=", escape=", header=true, delimiter=?]. The delimiter got changed.
Can you please let us know what went wrong here?
Data is also loaded only into the first column, and the values for the rest of the columns are null.
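One way to confirm what separator the file actually contains (a PySpark sketch; the abfss path below is the same placeholder as in the script above, so adjust it to your environment):

# Read the raw text and inspect the code points of the first line;
# 0x318a should appear between fields if ㆊ really is the separator.
raw = spark.read.text("abfss://xxxx@yyyyy.dfs.core.windows.net/Raw/OPERATIONS/BUSINESSSERVICES/***/xx_DATA_OTC")
first_line = raw.first()["value"]
print([hex(ord(c)) for c in first_line[:80]])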
- Labels:
  - String
  - Unicode
  - Unicode Field Separator
03-24-2022 12:21 AM
Also, all data is loaded into a single column. The values of the other columns are stored as null.
03-24-2022 04:02 AM
sep "\u318a"
delimeter " \x318a"
sep " \x318a"
Try to use sep instead or/and x instead.
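For example, a minimal PySpark read using sep (a sketch; "path_to_CSV_data" is a placeholder, and the Python literal "\u318a" resolves to the actual ㆊ character):

df = (spark.read
      .option("header", True)    # first row contains column names
      .option("sep", "\u318a")   # single-character Unicode separator
      .csv("path_to_CSV_data"))
df.show(5, truncate=False)       # eyeball whether the columns split correctly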
03-24-2022 09:50 AM
Thanks @Hubert Dudek for your response. I tried these options; unfortunately, they did not work.
04-25-2022 02:33 PM
Have you tried using "multiLine"? Also, try reading the file with the CSV reader first to validate that the data is correct; then you can create the table.
For example:

df = (spark.read
      .option("header", True)
      .option("multiLine", True)
      .option("escape", "_especial_value_")  # placeholder for your escape character
      .csv("path_to_CSV_data"))
05-10-2022 10:17 AM
Hi @Rajib Rajib Mandal,
Just a friendly follow-up. Do you still need help with this question, or did any of our responses help resolve the issue? If so, please mark the best one as the solution.
06-13-2022 02:16 PM
Hi @Rajib Rajib Mandal,
What have you tried so far? It will help us narrow down the scope. Please share as many details as possible.

