03-24-2022 12:18 AM
We are receiving a CSV file separated by the character \u318a (ㆊ). We want to create an unmanaged table in Databricks. Here is the table creation script.
create table IF NOT EXISTS db_test_raw.t_data_otc_poc
(`caseidt` String,
`worktype` String,
`doctyp` String,
`brand` String,
`reqemailid` String,
`subprocess` String,
`accountname` String,
`location` String,
`lineitems` String,
`emailsubject` String,
`createddate` String,
`process` String,
`archivalbatchid` String,
`createddt` String,
`customername` String,
`invoicetype` String,
`month` String,
`payernumber` String,
`sapaccountnumber` String,
SOURCE_BUSINESS_DATE Date)
USING CSV
OPTIONS (header 'true', encoding 'UTF-8', quote '"', escape '"', delimiter '\u318a',
path 'abfss://xxxx@yyyyy.dfs.core.windows.net/Raw/OPERATIONS/BUSINESSSERVICES/***/xx_DATA_OTC')
PARTITIONED BY (SOURCE_BUSINESS_DATE)
The table was created successfully in Databricks. However, when we ran
describe table extended db_test_raw.t_data_otc_poc
we found the storage properties as [encoding=UTF-8, quote=", escape=", header=true, delimiter=?]. The delimiter was changed to '?'.
Can you please let us know what went wrong here?
All the data is loaded into the first column, and the values of the remaining columns are null.
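One plausible cause (my assumption, not confirmed in this thread): the delimiter ㆊ (U+318A) is a single character but occupies three bytes in UTF-8, and any layer that stores the table property in a single-byte charset such as Latin-1 cannot represent it, which is one way a delimiter ends up displayed as '?'. A standalone sketch of the code-point vs byte-length mismatch:

```python
# U+318A (ㆊ) is one Unicode code point but three bytes in UTF-8.
delim = "\u318a"

print(len(delim))                  # 1 code point
print(len(delim.encode("utf-8")))  # 3 bytes in UTF-8

# A single-byte charset such as Latin-1 cannot represent it at all.
try:
    delim.encode("latin-1")
except UnicodeEncodeError:
    print("not representable in latin-1")
```

If the metastore or the CSV parser handles the option as single-byte, the character is lost before it ever reaches the reader.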
03-24-2022 12:21 AM
Also, all data is loaded into a single column; the values of the other columns are stored as null.
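For what it's worth, the character itself is a perfectly valid single-character delimiter; for example, Python's built-in csv module parses a ㆊ-separated line cleanly (a standalone sketch with made-up sample rows), which suggests the problem lies in how the option value reaches the Spark CSV reader, not in the data:

```python
import csv
import io

# Hypothetical sample rows using the actual delimiter character.
sample = "caseidt\u318aworktype\u318abrand\n1001\u318aOTC\u318aAcme\n"

rows = list(csv.reader(io.StringIO(sample), delimiter="\u318a"))
print(rows)  # [['caseidt', 'worktype', 'brand'], ['1001', 'OTC', 'Acme']]
```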
03-24-2022 04:02 AM
sep "\u318a"
delimiter "\x318a"
sep "\x318a"
Try using the sep option instead of delimiter, and/or the \x notation instead of \u.
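A caution on the \x variant (this is a general escape-syntax fact, not specific to Spark): in most languages a \x escape consumes exactly two hex digits, so "\x318a" does not denote U+318A; it is \x31 ('1') followed by the literal characters '8' and 'a'. Only the \u form names the intended code point:

```python
# \x takes exactly two hex digits: \x31 is '1', then '8' and 'a' are literal.
print("\x318a")              # 18a
print("\u318a")              # ㆊ
print("\x318a" == "\u318a")  # False
```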
03-24-2022 09:50 AM
Thanks @Hubert Dudek for your response. I tried these options; unfortunately, they did not work.
04-25-2022 02:33 PM
Have you tried using "multiLine"? Also, try reading the file with spark.read.csv first to validate it; once you have confirmed the data is correct, you can create the table.
For example:
df = (spark.read
    .option("header", True)
    .option("multiLine", True)
    .option("escape", "_especial_value_")
    .csv("path_to_CSV_data"))
05-10-2022 10:17 AM
Hi @Rajib Rajib Mandal,
Just a friendly follow-up. Do you still need help with this question? Did any of our responses help resolve the issue? If so, please mark it as the best answer.
06-13-2022 02:16 PM
Hi @Rajib Rajib Mandal,
What have you tried so far? It will help us narrow down the scope. Please share as many details as possible.