Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Unicode field separator to create unmanaged table in Databricks for CSV file

RajibRajib_Mand
New Contributor III

We are receiving a CSV file separated by \u318a (ㆊ). We want to create an unmanaged table in Databricks. Here is the table creation script:

CREATE TABLE IF NOT EXISTS db_test_raw.t_data_otc_poc (
  `caseidt` STRING,
  `worktype` STRING,
  `doctyp` STRING,
  `brand` STRING,
  `reqemailid` STRING,
  `subprocess` STRING,
  `accountname` STRING,
  `location` STRING,
  `lineitems` STRING,
  `emailsubject` STRING,
  `createddate` STRING,
  `process` STRING,
  `archivalbatchid` STRING,
  `createddt` STRING,
  `customername` STRING,
  `invoicetype` STRING,
  `month` STRING,
  `payernumber` STRING,
  `sapaccountnumber` STRING,
  SOURCE_BUSINESS_DATE DATE
)
USING CSV
OPTIONS (
  header 'true',
  encoding 'UTF-8',
  quote '"',
  escape '"',
  delimiter '\u318a',
  path 'abfss://xxxx@yyyyy.dfs.core.windows.net/Raw/OPERATIONS/BUSINESSSERVICES/***/xx_DATA_OTC'
)
PARTITIONED BY (SOURCE_BUSINESS_DATE)

The table was created successfully in Databricks.

However, when checking with describe table extended db_test_raw.t_data_otc_poc, we found the storage properties reported as [encoding=UTF-8, quote=", escape=", header=true, delimiter=?]. The delimiter got changed.

Can you please let us know what went wrong here?

Also, all data is loaded into the first column, and the values for the rest of the columns are null.
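As a first sanity check (not from the thread; a minimal sketch using made-up sample data), you can confirm in plain Python that U+318A is a single character and that a row in that shape actually splits on it:

```python
# Hypothetical sample row using the U+318A (ㆊ) separator.
# The real file path and data are not shown in the thread.
sample_line = "C001\u318aInvoice\u318aPDF"

sep = "\u318a"
print(len(sep))                 # 1 -- U+318A is a single character
print(sample_line.split(sep))   # ['C001', 'Invoice', 'PDF']
```

If a line from the real file does not split into the expected number of fields here, the problem is in the file itself rather than in the table definition.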

7 REPLIES

RajibRajib_Mand
New Contributor III

Also, all data is loaded into a single column. The values of the other columns are stored as null.

Hubert-Dudek
Esteemed Contributor III

Try using sep instead of delimiter, and/or the \x escape instead of \u:

sep "\u318a"

delimiter "\x318a"

sep "\x318a"

Thanks @Hubert Dudek​ for your response. I tried these options; unfortunately, they did not work.

Have you tried using "multiLine"? Also, try reading the file as a CSV first to validate it; once you have confirmed the data is correct, you can create the table.

For example:

df = (spark.read
      .option("header", True)
      .option("multiLine", True)
      .option("escape", "_especial_value_")
      .csv("path_to_CSV_data"))
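To validate the file outside Spark as well (a hedged sketch with invented sample data, not the asker's actual file), Python's built-in csv module also accepts U+318A as a delimiter, since it is a single character:

```python
import csv
import io

# Invented two-row sample in the same shape as the described file.
data = "caseidt\u318aworktype\nA1\u318aOTC\n"

# csv.reader requires a 1-character delimiter; "\u318a" qualifies.
rows = list(csv.reader(io.StringIO(data), delimiter="\u318a"))
print(rows)   # [['caseidt', 'worktype'], ['A1', 'OTC']]
```

If this parses the real file correctly but Spark does not, that points at how the delimiter option is being passed to Spark rather than at the file itself.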

Hi @Rajib Rajib Mandal​ ,

Just a friendly follow-up. Do you still need help with this question? Did any of our responses help resolve the issue? If yes, please mark it as the best answer.

Hi @Jose Gonzalez ,

Yes, I still need help. No response has resolved it yet.

Regards,

Rajib

Hi @Rajib Rajib Mandal​,

What have you tried so far? It will help us narrow down the scope. Please share as many details as possible.
