UTF-8 troubles in DLT
03-25-2024 03:11 AM
Issues with UTF-8 in DLT
I am having issues with UTF-8 in DLT.
I have tried to set the Spark config on the cluster running the DLT pipeline.
On normal compute I fixed this under the advanced settings like this:
spark.conf.set("spark.driver.extraJavaOptions", "-Dfile.encoding=UTF-8")
spark.conf.set("spark.executor.extraJavaOptions", "-Dfile.encoding=UTF-8")
However, this does not work with DLT. Have any of you guys figured this out?
- Eirik
05-03-2024 12:42 AM
Hi @Retired_mod!
Sorry for a long wait...
The problem is not the columns or the data itself; the UTF-8 option for CSV is working fine. The issue seems to be that the table names are not compatible. If I run the query through Auto Loader outside DLT and use backticks for catalog_name, schema_name and table_name, like this: `dev`.`bronze`.`bokføring`, it works perfectly.
Is there any way that this can be done in DLT? Do you know the timeline for when the runtime will be upgraded so that it will work?
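For reference, the backtick quoting that works for me outside DLT can be sketched as a small helper. This is only a minimal sketch: `quote_ident` and `qualified_name` are hypothetical names I made up for illustration, not part of any Databricks or DLT API, and whether DLT accepts the resulting name is exactly the open question.

```python
def quote_ident(part: str) -> str:
    """Wrap one identifier part in backticks, doubling any embedded backtick."""
    return "`" + part.replace("`", "``") + "`"


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Join catalog, schema and table into one backtick-quoted three-level name."""
    return ".".join(quote_ident(p) for p in (catalog, schema, table))


# Hypothetical usage with the names from this thread.
target = qualified_name("dev", "bronze", "bokføring")
print(target)
```

Outside DLT, a string like `target` is what I would pass as the table name when the Auto Loader stream writes its output (for example via `toTable(target)` on the stream writer), assuming a runtime that supports non-ASCII characters in quoted identifiers.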
09-02-2024 02:41 AM