
Exception "java.nio.charset.MalformedInputException: Input length = 1" when creating data profile on Docker Container Service (10.4 LTS)

adrianna2942842
New Contributor III

I am encountering an issue while attempting to create a data profile on clusters using Docker Container Service (version 10.4 LTS). I keep receiving the following exception:

java.nio.charset.MalformedInputException: Input length = 1

What's puzzling is that I have tested the data profile creation process on clusters without Docker, using the same library dependencies, and it works flawlessly. However, when utilizing Docker Container Service, this exception consistently occurs regardless of the input data.

I have made several attempts with different data sets, but the problem persists. I suspect that Docker Container Service may be interfering with the character encoding or input handling in some way, leading to this exception.

Has anyone else encountered a similar issue with Docker Container Service and the java.nio.charset.MalformedInputException? I would greatly appreciate any insights, experiences, or possible solutions to help me resolve this problem.

Accepted Solutions

User16752242622
Valued Contributor

The MalformedInputException is an exception in the java.nio.charset package in Java that indicates that an input sequence is malformed or cannot be decoded correctly using a specific character set.
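To make the exception concrete, here is a minimal sketch that reproduces it outside Databricks. It is not the data profiler's code: the class name `MalformedInputDemo` and the helper `strictDecode` are made up for illustration. It encodes a string containing a non-ASCII character as UTF-8 and then decodes those bytes strictly as US-ASCII, which is roughly what can happen when a JVM falls back to an ASCII default charset.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class MalformedInputDemo {

    // Hypothetical helper: decodes bytes strictly, reporting bad input
    // as an exception instead of silently substituting U+FFFD.
    public static String strictDecode(byte[] bytes, Charset charset)
            throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) throws Exception {
        // "caf\u00e9" contains one non-ASCII character, which UTF-8
        // encodes as two bytes (0xC3 0xA9).
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset succeeds.
        System.out.println(strictDecode(utf8Bytes, StandardCharsets.UTF_8));

        // Decoding the same bytes as US-ASCII fails on the first byte >= 0x80.
        try {
            strictDecode(utf8Bytes, StandardCharsets.US_ASCII);
        } catch (CharacterCodingException e) {
            System.out.println(e);
        }
    }
}
```

The second call fails with a `MalformedInputException` whose input length is 1, matching the message in the question.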

In this case, `java.nio.charset.MalformedInputException` is caused by a difference in the default locale settings on the DCS cluster. After you set the environment variables below in the DCS cluster configuration, your code should run fine.

Add the following settings to the cluster's environment variables:

LANG=C.UTF-8

LC_ALL=C.UTF-8

By setting LANG=C.UTF-8 and LC_ALL=C.UTF-8, you are configuring the locale to use the UTF-8 character encoding, which can help address issues related to character encoding and malformed input when working with Java processes.
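To check whether the variables actually took effect, a small diagnostic like the one below can be run on the cluster. This is a sketch under assumptions: the class name `LocaleCheck` and both helpers are hypothetical, and the UTF-8 test is a rough suffix check that ignores locale modifiers. On Linux, Java 17 and earlier generally derive the default charset from the process locale, so an unset or plain `C`/`POSIX` locale tends to show up as an ASCII default charset here.

```java
import java.nio.charset.Charset;
import java.util.Locale;
import java.util.Map;

public class LocaleCheck {

    // Hypothetical helper: returns the locale name that governs character
    // handling, using the POSIX precedence LC_ALL > LC_CTYPE > LANG.
    public static String effectiveLocale(Map<String, String> env) {
        for (String key : new String[] {"LC_ALL", "LC_CTYPE", "LANG"}) {
            String value = env.get(key);
            if (value != null && !value.isEmpty()) {
                return value;
            }
        }
        return null;
    }

    // Rough heuristic (an assumption, not an official rule): treat names
    // ending in ".UTF-8" or ".utf8" as UTF-8 locales.
    public static boolean isUtf8Locale(String name) {
        if (name == null) {
            return false;
        }
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".utf-8") || lower.endsWith(".utf8");
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        String locale = effectiveLocale(env);
        System.out.println("LANG            = " + env.get("LANG"));
        System.out.println("LC_ALL          = " + env.get("LC_ALL"));
        System.out.println("effective       = " + locale);
        System.out.println("looks UTF-8     = " + isUtf8Locale(locale));
        System.out.println("default charset = " + Charset.defaultCharset());
    }
}
```

If the default charset still does not report UTF-8 after the change, the variables are probably not reaching the JVM process, for example because the cluster was not restarted.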


Thank you for the reply! I have confirmed that the above solution fixed the exception.

Vartika
Moderator

Hi @Adrianna Klank,

We haven't heard from you since the last response from @Akash Bhat, and I was checking back to see if the suggestion helped you.

Otherwise, if you have found a solution, please share it with the community, as it can be helpful to others.

Also, please don't forget to click the "Select As Best" button whenever the information provided helps resolve your question.
