
Decimal Precision error

sai_sathya
New Contributor III

When I try to create a DataFrame like this:

from decimal import Decimal
from pyspark.sql.types import StructType, StructField, StringType, DecimalType

lstOfRange = [['CREDIT_LIMIT_RANGE', Decimal(10000000.010000), Decimal(100000000000000000000000.000000), '>10,000,000', 'G']]
RangeSchema = StructType([StructField("rangeType", StringType(), True),
    StructField("rangeFrom", DecimalType(32, 6), True),
    StructField("rangeTo", DecimalType(32, 6), True),
    StructField("rangeName", StringType(), True),
    StructField("rangeOrder", StringType(), True)])

df = spark.createDataFrame(data=lstOfRange, schema=RangeSchema)


When I display df, the rangeTo column shows this value: 99999999999999991611392.000000

Printing the Decimal directly gives the same result:

print(Decimal(100000000000000000000000.000000))
output: 99999999999999991611392

1 REPLY

Kaniz_Fatma
Community Manager

Hi @sai_sathya, the value you are seeing in the rangeTo column of your DataFrame comes from floating-point precision: the number is written as a Python float literal before it is passed to Decimal.

Let's break down what's happening:

  1. Floating-Point Precision:

    • A Python float literal such as 100000000000000000000000.000000 is stored as a 64-bit binary floating-point number, which uses a finite number of bits.
    • Due to this finite representation, some decimal values cannot be exactly represented in binary.
    • A 64-bit float carries only about 15 to 17 significant decimal digits, so values with more digits than that are rounded to the nearest representable number.
  2. Decimal Representation:

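    • In your case, the literal 100000000000000000000000.000000 is evaluated as a Python float first, and that value is not exactly representable as a binary floating-point number.
    • By the time Decimal receives the value, it has already been rounded to the nearest representable float, 99999999999999991611392. Decimal then stores that rounded value exactly.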
  3. DataFrame Display Issue:

    • When you display the DataFrame, the value in the rangeTo column is shown as 99999999999999991611392.000000.
    • Spark stores the Decimal it was given, so the approximation happened when the float literal was evaluated, before the DataFrame was created, and not while displaying it.
  4. Printing the Decimal Value:

    • When you print the Decimal(100000000000000000000000.000000) directly, you see the same approximate value (99999999999999991611392).
  5. Solution:

    • To keep full precision, construct Decimal values from strings (or integers) rather than from float literals, for example Decimal("100000000000000000000000.000000"), and use the Decimal type consistently throughout your code.
    • If you need to display the value in a more human-readable format, you can format it explicitly (e.g., with a fixed number of decimal places).
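
Points 2 and 5 can be sketched in plain Python using only the standard decimal module. The snippet contrasts building a Decimal from a float literal with building it from a string. The strict_decimal helper is hypothetical (it is not part of any library) and shows one optional way to catch float input early by trapping the FloatOperation signal.

```python
from decimal import Decimal, FloatOperation, localcontext

# The float literal is rounded to the nearest 64-bit double *before*
# Decimal() sees it; Decimal then converts that double exactly.
from_float = Decimal(100000000000000000000000.000000)

# A string is parsed digit by digit, so no digits are lost.
from_string = Decimal("100000000000000000000000.000000")

print(from_float)   # 99999999999999991611392
print(from_string)  # 100000000000000000000000.000000


def strict_decimal(value):
    """Hypothetical helper: build a Decimal, rejecting float input.

    With FloatOperation trapped, Decimal(<float>) raises instead of
    silently accepting an already-rounded value.
    """
    with localcontext() as ctx:
        ctx.traps[FloatOperation] = True
        return Decimal(value)
```

Calling strict_decimal("100000000000000000000000.000000") returns the exact value, while strict_decimal(100000000000000000000000.000000) raises decimal.FloatOperation.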

Here's an example of how you can format the value for display:

from decimal import Decimal

# Original value
original_value = Decimal("100000000000000000000000.000000")

# Display with 6 decimal places
formatted_value = format(original_value, ".6f")

print(f"Original value: {original_value}")
print(f"Formatted value: {formatted_value}")

Output:

Original value: 100000000000000000000000.000000
Formatted value: 100000000000000000000000.000000

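Remember that formatting only changes how a value is displayed. A Decimal built from a float literal already holds the approximated value, so the DataFrame stores 99999999999999991611392.000000 unless the Decimal is constructed from a string, as in the example above.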
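
For the DataFrame from the question, a possible fix looks like the sketch below: the same row and schema, with the Decimal values built from strings. It assumes PySpark is available and that spark is an active SparkSession, as in a Databricks notebook. The names range_rows and build_range_df are illustrative and not part of the original code.

```python
from decimal import Decimal

# String literals keep every digit; float literals would be rounded first.
range_rows = [
    [
        "CREDIT_LIMIT_RANGE",
        Decimal("10000000.010000"),
        Decimal("100000000000000000000000.000000"),
        ">10,000,000",
        "G",
    ],
]


def build_range_df(spark):
    """Build the range DataFrame; `spark` is assumed to be a SparkSession."""
    # Imported here so the row data can be inspected without PySpark installed.
    from pyspark.sql.types import (
        DecimalType, StringType, StructField, StructType,
    )

    range_schema = StructType([
        StructField("rangeType", StringType(), True),
        StructField("rangeFrom", DecimalType(32, 6), True),
        StructField("rangeTo", DecimalType(32, 6), True),
        StructField("rangeName", StringType(), True),
        StructField("rangeOrder", StringType(), True),
    ])
    return spark.createDataFrame(data=range_rows, schema=range_schema)
```

In a notebook, build_range_df(spark).show(truncate=False) should then display rangeTo as 100000000000000000000000.000000.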
