
Precision and scale are getting changed in the dataframe while casting to decimal

_Orc
New Contributor

When I run the below query in Databricks SQL, the precision and scale of the decimal column get changed.

Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0));

Output: decimal(16,6)

Expected output: decimal(15,6)

Any reason why the precision and scale are getting changed?

1 ACCEPTED SOLUTION


-werners-
Esteemed Contributor III

There is a reason indeed.

And the reason is type coercion:

In your COALESCE you pass 0 as the second value, and 0 is an integer.

So Spark will coerce it to a decimal type. To avoid losing any information, it reserves 10 digits before the decimal point (the max value of a signed integer is 2147483647 -> 10 digits).

So with (15,6) you only have 9 digits before the decimal point => Spark coerces this to (16,6).

The same for (8,0) => (10,0),

and (12,0) remains (12,0).

If you want to avoid this coercion, you can pass 0.00 (a decimal) instead of 0 (an int).

I totally agree that the type coercion can be confusing btw.
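
A quick way to see both behaviors side by side in a Databricks SQL cell (a minimal sketch; the column aliases are just for illustration):

%sql
Select typeof(0)                                            as int_literal,  -- int: this is what forces the widening
       typeof(COALESCE(Cast(3.45 as decimal(15,6)), 0))     as coerced,      -- decimal(16,6), as in the question
       typeof(COALESCE(Cast(3.45 as decimal(15,6)), 0.00))  as not_coerced;  -- decimal(15,6), since 0.00 already fits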


6 REPLIES

Kaniz
Community Manager

Hi @Om Singh! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Or else I will get back to you soon. Thanks.


AmanSehgal
Honored Contributor III

I believe the default precision changes as you change the scale.

For instance, if you set the precision and scale for casting as (4,0), then Spark will default it to (10,0).

For (8,1), it'll default to (11,1)

For (8,2), it'll default to (12,2)

For (8,3), it'll default to (13,3)

For (8,4), it'll default to (14,4)

For (8,5), it'll default to (15,5)

For (8,6), it'll default to (16,6), and so on.

I think the logic is:

if precision - scale < 10:
    precision = 10 + scale
# otherwise precision stays as given
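
A few typeof checks, following the same COALESCE-with-an-int pattern as the original query, line up with this (a sketch; the aliases are just for illustration):

%sql
Select typeof(COALESCE(Cast(1 as decimal(4,0)), 0))  as p4_s0,   -- expected: decimal(10,0)
       typeof(COALESCE(Cast(1 as decimal(8,2)), 0))  as p8_s2,   -- expected: decimal(12,2)
       typeof(COALESCE(Cast(1 as decimal(12,0)), 0)) as p12_s0;  -- expected: decimal(12,0)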

-werners-
Esteemed Contributor III

This is only correct if you pass an integer as the 2nd argument of the COALESCE.

For example, Select typeof(COALESCE(Cast(3.45 as decimal(11,2)), cast(0 as long))) returns decimal(22,2), because Spark widens a long to decimal(20,0), i.e. 20 digits before the decimal point.

If you pass a decimal, other rules apply.
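
Side by side, the type of the second argument is what drives the result (a sketch; the aliases are just for illustration):

%sql
Select typeof(COALESCE(Cast(3.45 as decimal(11,2)), 0))                        as with_int,     -- expected: decimal(12,2)
       typeof(COALESCE(Cast(3.45 as decimal(11,2)), cast(0 as long)))          as with_long,    -- decimal(22,2), as above
       typeof(COALESCE(Cast(3.45 as decimal(11,2)), cast(0 as decimal(11,2)))) as with_decimal; -- decimal(11,2), no widening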

RKNutalapati
Valued Contributor

Hi @Om Singh - you can try the below; the second statement will give you the required result:

%sql
Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0)) as CastToDecimal
Union All
Select typeof(Cast(COALESCE(3.45,0) as decimal(15,6))) as CastToDecimal;
 

Result of the above:

  CastToDecimal
1 decimal(16,6)
2 decimal(15,6)

berserkersap
Contributor

You can use

typeof(COALESCE(Cast(3.45 as decimal(15,6)), 0.0)); (i.e. 0.0 instead of 0)
