03-24-2022 12:18 AM
We are receiving a CSV file separated by
\u318a (ㆊ)
and want to create an unmanaged table in Databricks. Here is the table creation script:
create table IF NOT EXISTS db_test_raw.t_data_otc_poc
(`caseidt` String,
`worktype` String,
`doctyp` String,
`brand` String,
`reqemailid` String,
`subprocess` String,
`accountname` String,
`location` String,
`lineitems` String,
`emailsubject` String,
`createddate` string,
`process` String,
`archivalbatchid` String,
`createddt` String,
`customername` String,
`invoicetype` String,
`month` String,
`payernumber` String,
`sapaccountnumber` String,
SOURCE_BUSINESS_DATE Date)
USING CSV
OPTIONS (header 'true', encoding 'UTF-8', quote '"', escape '"', delimiter '\u318a', path
'abfss://xxxx@yyyyy.dfs.core.windows.net/Raw/OPERATIONS/BUSINESSSERVICES/***/xx_DATA_OTC')
PARTITIONED BY (SOURCE_BUSINESS_DATE)
The table created successfully in databricks.
While checking with (
describe table extended db_test_raw.t_data_otc_poc
), we found the storage properties as [encoding=UTF-8, quote=", escape=", header=true, delimiter=?]. The delimiter got changed.
Can you please let us know what went wrong here?
The data is also loaded into the first column only, and the values for the rest of the columns are null.
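For what it's worth, one possible cause (an assumption, not confirmed in this thread) is that the delimiter option reaches the CSV reader as the literal six-character string \u318a rather than the single resolved character ㆊ, in which case no row would ever split and all data would land in the first column. A minimal pure-Python sketch of the distinction between the two forms:

```python
# A raw string keeps the backslash-u escape as six literal characters,
# while a normal string literal resolves it to the single character U+318A.
raw = r'\u318a'   # the 6 characters: \ u 3 1 8 a
real = '\u318a'   # the 1 character: ㆊ

print(len(raw), len(real), hex(ord(real)))  # 6 1 0x318a
```

If the table's stored delimiter behaves like `raw` rather than `real`, the separator will never match the bytes in the file.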
03-24-2022 12:21 AM
Also, all the data is loaded into a single column; the values of the other columns are stored as null.
03-24-2022 04:02 AM
Try using sep instead of delimiter, and/or \x instead of \u:
sep "\u318a"
delimiter "\x318a"
sep "\x318a"
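A caveat on the \x form (my note, not part of the reply above): in a Python string literal, \x consumes exactly two hex digits, so "\x318a" is the character 0x31 ('1') followed by the literal text "8a", not ㆊ. Only the four-digit \u escape produces the intended character. A quick check:

```python
# \x takes exactly two hex digits: "\x318a" is '1' + "8a", three characters.
assert '\x318a' == '18a'
# The \u escape with four hex digits yields the single character U+318A.
assert len('\u318a') == 1
```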
03-24-2022 09:50 AM
Thanks @Hubert Dudek for your response. I tried these options, but unfortunately they did not work.
04-25-2022 02:33 PM
Have you tried using the "multiLine" option? Also, try reading the file with the CSV reader first to validate it; once you have confirmed the data is correct, you can create the table.
For example:
df = (spark.read
    .option("header", True)
    .option("multiLine", True)
    .option("escape", "_especial_value_")
    .csv("path_to_CSV_data"))
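Building on the validate-first idea above, the separator can also be sanity-checked locally without Spark. This is a sketch with made-up sample data; Python's csv module accepts any single-character delimiter, including ㆊ:

```python
import csv
import io

# Two-line sample using the U+318A character as the field separator.
sample = "caseidt\u318aworktype\nC001\u318aRepair\n"

rows = list(csv.reader(io.StringIO(sample), delimiter='\u318a'))
print(rows)  # [['caseidt', 'worktype'], ['C001', 'Repair']]
```

If a few real lines from the file split correctly here, the separator character itself is fine and the problem lies in how the option is passed to the table definition.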
05-10-2022 10:17 AM
Hi @Rajib Rajib Mandal,
Just a friendly follow-up. Do you still need help with this question? Did any of our responses help resolve the issue? If so, please mark that answer as best.
06-13-2022 02:16 PM
Hi @Rajib Rajib Mandal,
What have you tried so far? It will help us narrow down the scope. Please share as many details as possible.