Bug with display function with structs?

VVM
New Contributor III

It appears to me that there's a deceptive bug when using the Databricks display function to view struct data. For whatever reason, runs of multiple spaces are collapsed to a single space:

from pyspark.sql.functions import struct

# Two string columns; some values contain runs of two or three spaces
df = spark.createDataFrame([
  ("this has two  spaces", "this has three   spaces"),
  ("this has one space", "this has nospace")
], ["sc", "osc"])

# Wrap both columns in a single struct column
df = df.select(struct(df.columns).alias("scstruct"))

display(df)

You'll see in the result that the values with 2 and 3 spaces are cut down to single spaces.

I came across this while attempting to debug a regex. Because of this bug, I couldn't tell what the data values actually were.
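One way to check what the values really contain, independent of how any output widget renders them, is to make the spaces visible or count them. The following is a minimal sketch in plain Python; the helper names `reveal_spaces` and `longest_space_run` are hypothetical and are not part of Databricks or PySpark. It also assumes the underlying data is intact and only the rendering collapses whitespace, which this thread does not confirm.

```python
def reveal_spaces(value: str, marker: str = "\u00b7") -> str:
    """Replace every space with a visible marker so runs of spaces can be counted by eye."""
    return value.replace(" ", marker)


def longest_space_run(value: str) -> int:
    """Return the length of the longest run of consecutive spaces in the string."""
    longest = 0
    current = 0
    for ch in value:
        current = current + 1 if ch == " " else 0
        longest = max(longest, current)
    return longest


# The same sample values as in the reproduction above
samples = [
    "this has two  spaces",
    "this has three   spaces",
    "this has one space",
    "this has nospace",
]

for s in samples:
    print(reveal_spaces(s), longest_space_run(s))
```

The same idea could be applied column-wise in Spark with `length` and `regexp_replace` from `pyspark.sql.functions`, for example by replacing each space in `scstruct.sc` with a visible character before displaying it. Printing with `df.show(truncate=False)` may also help, since it writes plain text rather than a rendered table, though that has not been checked against this particular issue.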

2 REPLIES

Anonymous

Hi @Patrick Mascari

Great to meet you, and thanks for your question!

Let's see if your peers in the community have an answer to your question. Thanks.

VVM
New Contributor III

Not so sure the community can help here, as this appears to be a verifiable and reproducible Databricks bug.
