Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Precision and scale is getting changed in the dataframe while casting to decimal

_Orc
New Contributor

When I run the query below in Databricks SQL, the precision and scale of the decimal column change.

SELECT typeof(COALESCE(CAST(3.45 AS decimal(15,6)), 0));

Output: decimal(16,6)

Expected output: decimal(15,6)

Is there a reason why the precision and scale change?

1 ACCEPTED SOLUTION

Accepted Solutions

-werners-
Esteemed Contributor III

There is a reason indeed, and the reason is type coercion.

In your COALESCE you pass 0 as the second value, and 0 is an integer, so Spark has to coerce both arguments to a common decimal type. To avoid losing information, that type needs 10 digits before the decimal point (the maximum value of a signed 32-bit integer is 2147483647, which has 10 digits).

With decimal(15,6) you only have 9 digits before the decimal point, so Spark coerces it to decimal(16,6).

The same goes for decimal(8,0), which becomes decimal(10,0), while decimal(12,0) stays decimal(12,0).

If you want to avoid this coercion, pass 0.00 (a decimal) instead of 0 (an int).

I totally agree that the type coercion can be confusing, btw.
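The widening rule described above can be sketched in a few lines of Python. This is an illustration of the rule, not Spark's actual implementation, and the helper name is made up:

```python
# Sketch of Spark's decimal widening when COALESCE mixes decimal(p, s)
# with an integer literal. An INT needs decimal(10, 0), so the common
# type must keep max(p - s, 10) digits before the decimal point.

def widen_with_int(p: int, s: int, int_digits: int = 10) -> tuple[int, int]:
    """Common decimal type for decimal(p, s) vs an integer that needs
    `int_digits` digits before the decimal point (10 for INT)."""
    return (max(p - s, int_digits) + s, s)

print(widen_with_int(15, 6))  # (16, 6) -- the surprising result
print(widen_with_int(8, 0))   # (10, 0)
print(widen_with_int(12, 0))  # (12, 0) -- already wide enough, unchanged
```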


6 REPLIES

Kaniz_Fatma
Community Manager

Hi @Om Singh! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first; otherwise I will get back to you soon. Thanks.


AmanSehgal
Honored Contributor III

I believe the default precision and scale change as you change the scale.

For instance, if you set the precision and scale for the cast to (4,0), Spark will default it to (10,0).

For (8,1), it'll default to (11,1)

For (8,2), it'll default to (12,2)

For (8,3), it'll default to (13,3)

For (8,4), it'll default to (14,4)

For (8,5), it'll default to (15,5)

For (8,6), it'll default to (16,6) and so on..

I think the logic is:

if precision - scale < 10:
    precision = 10 + scale
# otherwise, precision is unchanged
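The observed defaults above can be checked against that conjectured logic with a few lines of Python (a sketch of the rule, not Spark's implementation):

```python
# Check the observed defaults against the conjectured rule: an INT needs
# 10 digits before the decimal point, so the widened precision is
# max(precision - scale, 10) + scale.
cases = {
    (4, 0): (10, 0),
    (8, 1): (11, 1),
    (8, 2): (12, 2),
    (8, 3): (13, 3),
    (8, 4): (14, 4),
    (8, 5): (15, 5),
    (8, 6): (16, 6),
}
for (p, s), expected in cases.items():
    widened = (max(p - s, 10) + s, s)
    assert widened == expected, (p, s)
print("all observed defaults match the rule")
```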

-werners-
Esteemed Contributor III

This is only correct if you pass an integer as the second argument of the COALESCE.

E.g., SELECT typeof(COALESCE(CAST(3.45 AS decimal(11,2)), CAST(0 AS long))) returns decimal(22,2), because a long requires 20 digits before the decimal point.

If you pass a decimal, other rules apply.
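The long case follows the same widening pattern; here is a hedged Python sketch of it (the helper name is made up, and the 20-digit figure comes from the reply above):

```python
# Sketch of the widening rule for a BIGINT second argument: per the
# reply above, a long needs 20 digits before the decimal point, so the
# common type must keep max(p - s, 20) integer digits.

def widen_with_long(p: int, s: int) -> tuple[int, int]:
    return (max(p - s, 20) + s, s)

print(widen_with_long(11, 2))  # (22, 2), matching the example above
```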

RKNutalapati
Valued Contributor

Hi @Om Singh - you can try the queries below; the second statement will give you the required result:

%sql
SELECT typeof(COALESCE(CAST(3.45 AS decimal(15,6)), 0)) AS CastToDecimal
UNION ALL
SELECT typeof(CAST(COALESCE(3.45, 0) AS decimal(15,6))) AS CastToDecimal;
 

Result of the above:

CastToDecimal
1  decimal(16,6)
2  decimal(15,6)

berserkersap
Contributor

You can use

typeof(COALESCE(CAST(3.45 AS decimal(15,6)), 0.0)) (i.e. 0.0 instead of 0)
