02-22-2022 10:02 AM
When i run the below query in databricks sql the Precision and scale of the decimal column is getting changed.
Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0));
o/p: decimal(16,6)
expected o/p: decimal(15,6)
Any reason why the Precision and scale is getting changed?
02-22-2022 11:44 PM
There is a reason indeed.
And the reason is type coercion:
In your coalesce, you enter 0 as a second value. 0 is an integer.
So Spark will coerce this to a decimal type. For not losing any information, it needs 10 digits in front of the comma (max value of a signed integer is 2147483647 -> 10 digits).
So when you put (15,6) you only have 9 digits => spark coerces this to 16,6.
The same for (8,0) => 10,0
And (12,0) remains 12,0.
If you want to avoid this coercion, you can pass 0.00 (decimal) instead of 0 (int).
I totally agree that the type coercion can be confusing btw.
02-22-2022 11:44 PM
There is a reason indeed.
And the reason is type coercion:
In your coalesce, you enter 0 as a second value. 0 is an integer.
So Spark will coerce this to a decimal type. For not losing any information, it needs 10 digits in front of the comma (max value of a signed integer is 2147483647 -> 10 digits).
So when you put (15,6) you only have 9 digits => spark coerces this to 16,6.
The same for (8,0) => 10,0
And (12,0) remains 12,0.
If you want to avoid this coercion, you can pass 0.00 (decimal) instead of 0 (int).
I totally agree that the type coercion can be confusing btw.
02-22-2022 11:54 PM
I believe the default precision and scale changes as you change the scale.
For instance if you set precision and scale for casting as (4,0), then spark will default it to (10,0).
For (8,1), it'll default to (11,1)
For (8,2), it'll default to (12,2)
For (8,3), it'll default to (13,3)
For (8,4), it'll default to (14,4)
For (8,5), it'll default to (15,5)
For (8,6), it'll default to (16,6) and so on..
I think the logic is:
if precision-scale < 10:
precision = 10+scale
else
precision=precision
02-23-2022 12:18 AM
This is only correct if you pass an integer as 2nd argument in the coalesce.
F.e. Select typeof(COALESCE(Cast(3.45 as decimal(11,2)),cast(0 as long))) returns (22,2) because long requires 20 digits in front of the comma.
If you pass a decimal, other rules apply.
02-23-2022 06:21 AM
Hi @Om Singh - You can try below, Second statement will give you the required result
%sql
Select typeof(COALESCE(Cast(3.45 as decimal(15,6)),0)) as CastToDecimal
Union All
Select typeof(Cast(COALESCE(3.45,0) as decimal(15,6))) as CastToDecimal;
Result of above:
CastToDecimal
1 decimal(16,6)
2 decimal(15,6)
08-13-2022 12:05 PM
You can use
typeof(COALESCE(Cast(3.45 as decimal(15,6)),0.0)); (instead of 0)
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group