cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

dataframe - cast string to decimal when encountering zeros returns OE-16

shan_chandra
Honored Contributor III
Honored Contributor III

The user is trying to cast string to decimal when encountering zeros. The cast function displays the  '0' as '0E-16'. could you please let us know your thoughts on whether 0s can be displayed as 0s?

from pyspark.sql import functions as F
df = spark.sql("select cast('0' AS decimal(38,16)) as decimal_number union all select cast('1.0000123400000' AS decimal(38,16))")
df2 = df.withColumn("string_column", F.expr("format_number(decimal_number, '0.######################')"))
display(df2)  

Screen Shot 2022-03-09 at 12.13.11 PM

1 ACCEPTED SOLUTION

Accepted Solutions

shan_chandra
Honored Contributor III
Honored Contributor III

If the scale of decimal type is greater than 6, scientific notation kicks in hence seeing 0E-16.

This behavior is described in the existing OSS spark issue - https://issues.apache.org/jira/browse/SPARK-25177

Kindly cast the column to a decimal type less than or equal to 6 to have zeros displayed as zeros.

from pyspark.sql import functions as F
df = spark.sql("select cast('0' AS decimal(10,6)) as decimal_number union all select cast('1.0000123400000' AS decimal(4,2))")
df2 = df.withColumn("string_column", F.expr("format_number(decimal_number, '0.######################')"))
display(df2)  

View solution in original post

1 REPLY 1

shan_chandra
Honored Contributor III
Honored Contributor III

If the scale of decimal type is greater than 6, scientific notation kicks in hence seeing 0E-16.

This behavior is described in the existing OSS spark issue - https://issues.apache.org/jira/browse/SPARK-25177

Kindly cast the column to a decimal type less than or equal to 6 to have zeros displayed as zeros.

from pyspark.sql import functions as F
df = spark.sql("select cast('0' AS decimal(10,6)) as decimal_number union all select cast('1.0000123400000' AS decimal(4,2))")
df2 = df.withColumn("string_column", F.expr("format_number(decimal_number, '0.######################')"))
display(df2)  

Welcome to Databricks Community: Lets learn, network and celebrate together

Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections. 

Click here to register and join today! 

Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.