Databricks Community

jordan72 · ‎06-30-2025

Hi,

I have the issue that German umlauts are not retrieved correctly via the JDBC driver.

It shows M�nchen instead of München.

I load the driver in my Java app via:

<dependency>
    <groupId>com.databricks</groupId>
    <artifactId>databricks-jdbc</artifactId>
    <version>2.7.3</version>
</dependency>

and set the charsets via:

System.setProperty("file.encoding", "UTF-8");
System.setProperty("sun.jnu.encoding", "UTF-8");

In the Databricks UI everything looks correct. The column type is STRING.

Regards

Volker Jordan

szymon_dybczak · ‎06-30-2025

Hi @jordan72 ,

Maybe try adding the following parameters to your JDBC connection URL:

- CharacterEncoding=UTF-8;

- UseUnicode=true;

- CharSet=UTF-8;

String url = "jdbc:databricks://<your-host>:443/default;transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3;UID=token;PWD=<token>;CharSet=UTF-8;characterEncoding=UTF-8;UseUnicode=true;";

jordan72 · ‎06-30-2025

I already tried all those parameters, but nothing changed.

Surprisingly, in DataGrip (which also uses the JDBC driver) the results are correct. I copied the URL from DataGrip into a raw Java IDE, and there it does not work.

szymon_dybczak · ‎06-30-2025

OK, thanks for the additional information. So maybe the issue is related to the JVM environment.
I noticed that you're setting the following property: System.setProperty("file.encoding", "UTF-8");
Java reads file.encoding once at JVM startup, so setting it with System.setProperty at runtime has no effect on string decoding in most libraries, including JDBC drivers.

Try launching your application with the following VM option:

java -Dfile.encoding=UTF-8
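
To see which encodings the JVM actually picked up, a small check like the one below may help. It is only a sketch: the class name EncodingReport and its helper method are made up for illustration, and native.encoding and stdout.encoding will simply print null on older JDKs that do not define them.

```java
import java.nio.charset.Charset;

final class EncodingReport {

    /**
     * Returns true if Charset.defaultCharset() stays the same after
     * file.encoding is changed with System.setProperty at runtime.
     */
    static boolean defaultCharsetIsFixedAtStartup() {
        Charset before = Charset.defaultCharset();
        String original = System.getProperty("file.encoding");
        // Pick a value that differs from the charset currently in use.
        String other = before.name().equalsIgnoreCase("UTF-8") ? "ISO-8859-1" : "UTF-8";
        System.setProperty("file.encoding", other);
        try {
            return Charset.defaultCharset().equals(before);
        } finally {
            // Restore the property so the check has no lasting side effect.
            if (original != null) {
                System.setProperty("file.encoding", original);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        String[] keys = {"file.encoding", "native.encoding", "stdout.encoding", "sun.jnu.encoding"};
        for (String key : keys) {
            System.out.println(key + " = " + System.getProperty(key));
        }
        System.out.println("Charset.defaultCharset() = " + Charset.defaultCharset());
        System.out.println("fixed at startup: " + defaultCharsetIsFixedAtStartup());
    }
}
```

Running it once with and once without the -D option shows whether the option reaches the JVM that your IDE starts.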

szymon_dybczak · ‎06-30-2025

Another thought: you can check whether this is a problem with your IDE configuration. Assuming you're using IntelliJ, check your file encoding settings: Settings -> Editor -> File Encodings.

Use the UTF-8, Luke! File Encodings in IntelliJ IDEA | The IntelliJ IDEA Blog

jordan72 · ‎06-30-2025

Hm, now it's getting even weirder. I usually use the NetBeans IDE. I just tried the same code in Eclipse, and there it worked without any special options. In NetBeans, even with

-Dfile.encoding=UTF-8

there is no change. Does anyone know what could cause this behaviour in NetBeans?

szymon_dybczak · ‎06-30-2025

OK, so that only confirms that the problem is not related to the driver. Rather, it is a quirk of NetBeans.
In NetBeans it is not sufficient to set only -Dfile.encoding=UTF-8.
Please follow the approach suggested in the following Stack Overflow thread, depending on the Java version you're using:

java - How to use UTF-8 character in Netbeans - Stack Overflow
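
One way to double-check that the driver returns the right string, independent of how any console renders it, is to print the code points instead of the characters. A minimal sketch follows; CodePointDump is a hypothetical helper name, and the literal stands in for a value read from the ResultSet.

```java
final class CodePointDump {

    /** Formats every code point of the string as U+XXXX, separated by spaces. */
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // Stand-in for resultSet.getString(...); the escape keeps the source file ASCII-only.
        String city = "M\u00FCnchen";
        System.out.println(describe(city));
    }
}
```

Because the output is pure ASCII, it cannot be garbled by the console. If the second code point is U+00FC, the string in memory is correct and only the display is wrong. If it is U+FFFD, the damage happened before printing.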

jordan72 · ‎06-30-2025

OK, so it seems to be related to the newly introduced native.encoding system property.

So in NetBeans you have to pass -Dstdout.encoding=utf-8 to the VM if you are using JDK 21.
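
For anyone who wants to reproduce the effect without a database, here is a rough sketch of what probably happens. It assumes that System.out writes in a single-byte native encoding while the output window decodes UTF-8. ISO-8859-1 is used as a stand-in for the native encoding, and the class and method names are invented for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class UmlautConsoleDemo {

    /**
     * Writes the text through a PrintStream that encodes with 'written',
     * then decodes the resulting bytes the way a console expecting 'shown' would.
     */
    static String roundTrip(String text, Charset written, Charset shown) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (PrintStream out = new PrintStream(buffer, true, written)) {
            out.print(text);
        }
        return new String(buffer.toByteArray(), shown);
    }

    public static void main(String[] args) {
        String city = "M\u00FCnchen";

        // Mismatch: the single byte 0xFC is not valid UTF-8, so it decodes to U+FFFD.
        String garbled = roundTrip(city, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println("contains U+FFFD: " + garbled.contains("\uFFFD"));

        // Matching encodings on both sides keep the umlaut intact.
        String intact = roundTrip(city, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println("round trip intact: " + intact.equals(city));

        // Possible in-code alternative to the VM option: an explicitly UTF-8 stdout.
        PrintStream utf8Out =
                new PrintStream(new FileOutputStream(FileDescriptor.out), true, StandardCharsets.UTF_8);
        utf8Out.println(city);
    }
}
```

The explicit PrintStream at the end only helps if the console really expects UTF-8. The VM option has the advantage that no code needs to change.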

szymon_dybczak · ‎06-30-2025

Yes, this is exactly what the link I provided above suggested:

German Umlauts wrong via JDBC