Databricks Community

jfpatenaude · ‎10-03-2024

I have a specific use case where I call another notebook using the dbutils.notebook.run() function. The other notebook does some processing and returns a string to the caller notebook through dbutils.notebook.exit(). The returned string contains French accented characters such as à, é, and è. Because of that, the calling notebook runs for about 5 minutes longer than the called notebook and eventually fails with an exception: com.databricks.WorkflowException: java.nio.charset.MalformedInputException: Input length = 1

If I remove the special characters, everything works fine. The same is true if I call .encode('ascii', 'ignore') on my string, but I need the correct string returned with its accents. Is there a way to keep my string intact? I'm on Databricks Runtime 13.3 LTS.

You can reproduce the behavior using these two simple notebooks.

Caller Notebook:

output = dbutils.notebook.run('called_notebook', 600)

Called Notebook:

dbutils.notebook.exit("La mise à jour des tables des données raffinées est terminée")
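To illustrate why the ASCII workaround mentioned in the question is not a real fix, here is a minimal sketch in plain Python (no Databricks required). It assumes the error handler meant is Python's built-in 'ignore', since Python has no 'remove' handler. The variable names are illustrative only.

```python
message = "La mise à jour des tables des données raffinées est terminée"

# The 'ignore' handler silently drops every character that has no ASCII encoding.
ascii_only = message.encode("ascii", "ignore").decode("ascii")
print(ascii_only)

# The distinct characters that were lost in the conversion.
dropped = sorted({ch for ch in message if not ch.isascii()})
print(dropped)
```

The result is pure ASCII, so it avoids the exception, but the accented letters are gone and cannot be recovered on the caller side.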

jennie258fitz · ‎10-03-2024

- Called Notebook

# Called Notebook
output_string = "La mise à jour des tables des données raffinées est terminée"
# Encode to UTF-8 bytes
output_bytes = output_string.encode('utf-8')
# Decode back to a str before returning
dbutils.notebook.exit(output_bytes.decode('utf-8'))

- Caller Notebook

# Caller Notebook
output = dbutils.notebook.run('called_notebook', 600)
# You may want to ensure it's in the right format after calling
output_string = output.encode('utf-8').decode('utf-8')
print(output_string)
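Note that in Python 3 a UTF-8 encode followed by a UTF-8 decode returns a string identical to the original, so the value handed to dbutils.notebook.exit() in the snippet above is unchanged. One alternative that may be worth trying is to pass an ASCII-only representation across the notebook boundary and restore it in the caller. The sketch below uses the standard json module for that. The helper names encode_exit_value and decode_exit_value are hypothetical, and this has not been verified against Databricks Runtime 13.3 LTS.

```python
import json


def encode_exit_value(text: str) -> str:
    """Serialize text as a JSON string containing only ASCII characters.

    With ensure_ascii=True, characters such as 'à' become \\uXXXX escapes.
    """
    return json.dumps(text, ensure_ascii=True)


def decode_exit_value(payload: str) -> str:
    """Restore the original text, accents included, from the JSON payload."""
    return json.loads(payload)


message = "La mise à jour des tables des données raffinées est terminée"
payload = encode_exit_value(message)
restored = decode_exit_value(payload)

# Assumed usage inside Databricks (dbutils is only available in a notebook):
#   called notebook:  dbutils.notebook.exit(encode_exit_value(message))
#   caller notebook:  output = decode_exit_value(
#                         dbutils.notebook.run('called_notebook', 600))

print(payload.isascii())    # the transported value has no non-ASCII characters
print(restored == message)  # the caller gets the accented string back
```

The design choice here is to keep the transported value within plain ASCII, which sidesteps whatever charset conversion triggers the MalformedInputException, while json.loads reverses the escaping losslessly.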

MalformedInputException when using extended ascii characters in dbutils.notebook.exit()
