<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64383#M6894</link>
    <description>&lt;P&gt;The same bug is affecting me, but only when using a Databricks runtime 14.3 LTS single user cluster.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using row_number() ordered by string columns, which should result in an integer. However, Spark seems to raise an internal error about being unable to convert between Long and Int.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Example code:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;WITH A AS (
  SELECT
  --opportunityid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_a.opportunityid) AS PK_A
  FROM 20_silver_crmeiw.opportunities AS tbl_a
)

, B AS (
  SELECT
  --mcw_contractid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_b.mcw_contractid) AS PK_B
  FROM 20_silver_crmeiw.mcw_contracts AS tbl_b
)

, C AS (
  SELECT
    PK_A
  FROM A

  UNION
  
  SELECT
    PK_B
  FROM B
)

SELECT
  *
FROM A
LEFT JOIN C
ON 1=1&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Without the join, this code is fine:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;WITH A AS (
  SELECT
  --opportunityid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_a.opportunityid) AS PK_A
  FROM 20_silver_crmeiw.opportunities AS tbl_a
)

, B AS (
  SELECT
  --mcw_contractid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_b.mcw_contractid) AS PK_B
  FROM 20_silver_crmeiw.mcw_contracts AS tbl_b
)

SELECT
PK_A
FROM A

UNION

SELECT
PK_B
FROM B&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The queries SELECT typeof(PK_A) FROM A and SELECT typeof(PK_B) FROM B both return 'int'.&lt;/P&gt;</description>
    <pubDate>Fri, 22 Mar 2024 09:36:21 GMT</pubDate>
    <dc:creator>jeroenvs</dc:creator>
    <dc:date>2024-03-22T09:36:21Z</dc:date>
    <item>
      <title>Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/62603#M6889</link>
      <description>&lt;P&gt;We have a cluster running on 13.3 LTS (includes Apache Spark 3.4.1, Scala 2.12).&lt;BR /&gt;We want to test with a different type of cluster (14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12)).&lt;/P&gt;&lt;P&gt;All of a sudden we get errors complaining that scala.math.BigInt cannot be cast to java.lang.Integer. Running the same SQL query on the cluster with version 13.3 does not give any errors.&lt;/P&gt;&lt;P&gt;Does anyone recognize this type of issue, and does anybody have an idea how to fix it?&lt;/P&gt;</description>
      <pubDate>Mon, 04 Mar 2024 16:20:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/62603#M6889</guid>
      <dc:creator>HeijerM84</dc:creator>
      <dc:date>2024-03-04T16:20:00Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/62739#M6892</link>
      <description>&lt;P&gt;Hi, thanks for your response. The message below shows where the error is thrown. I don't see how this helps me, as I don't have much knowledge of Spark and Java. Does this message make any sense to you, and if so, does it give you a more specific idea of where to find a solution?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-left" image-alt="error databricks.jpg" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/6517i437F55D9E600F4E3/image-size/large/is-moderation-mode/true?v=v2&amp;amp;px=999" role="button" title="error databricks.jpg" alt="error databricks.jpg" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 10:28:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/62739#M6892</guid>
      <dc:creator>HeijerM84</dc:creator>
      <dc:date>2024-03-06T10:28:13Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/62755#M6893</link>
      <description>&lt;P&gt;I have found out when the issue arises. Below is a simplified version of the situation.&lt;/P&gt;&lt;P&gt;I create a temporary view called ‘v_dim_One’ with an arbitrary column and a row-number column whose maximum value is, for example, 200.&lt;/P&gt;&lt;P&gt;%sql&lt;/P&gt;&lt;P&gt;CREATE OR REPLACE GLOBAL TEMPORARY VIEW v_dim_One AS&lt;/P&gt;&lt;P&gt;SELECT&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; row_number() over (order by &amp;nbsp;ColA asc) AS DimA_NK&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; , ColA&lt;/P&gt;&lt;P&gt;FROM TableOne&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Then I create a second temporary view that contains a UNION and joins ‘v_dim_One’ to return the row-number column ‘DimA_NK’.&lt;/P&gt;&lt;P&gt;%sql&lt;/P&gt;&lt;P&gt;CREATE OR REPLACE GLOBAL TEMPORARY VIEW v_dim_Two AS&lt;/P&gt;&lt;P&gt;SELECT&lt;/P&gt;&lt;P&gt;ColB&lt;/P&gt;&lt;P&gt;, DimA_NK&lt;/P&gt;&lt;P&gt;FROM TableTwoFirst&lt;/P&gt;&lt;P&gt;LEFT JOIN global_temp.v_dim_One ON ColA=ColB&lt;/P&gt;&lt;P&gt;UNION&lt;/P&gt;&lt;P&gt;SELECT&lt;/P&gt;&lt;P&gt;ColB&lt;/P&gt;&lt;P&gt;, DimA_NK&lt;/P&gt;&lt;P&gt;FROM TableTwoSecond&lt;/P&gt;&lt;P&gt;LEFT JOIN global_temp.v_dim_One ON ColA=ColB&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Now when I run select * from global_temp.v_dim_Two, the previously described error occurs.&lt;/P&gt;&lt;P&gt;I did some extensive tests. 
The error does not occur in the following scenarios:&lt;BR /&gt;- dim_Two only selects from TableTwoFirst (so only the part above the union)&lt;/P&gt;&lt;P&gt;- dim_Two only selects from TableTwoSecond (so only the part below the union)&lt;/P&gt;&lt;P&gt;- The statement used to create temp view ‘v_dim_Two’ is queried directly&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I can prevent the error by casting DimA_NK as a long:&lt;/P&gt;&lt;P&gt;%sql&lt;/P&gt;&lt;P&gt;CREATE OR REPLACE GLOBAL TEMPORARY VIEW v_dim_Two AS&lt;/P&gt;&lt;P&gt;SELECT&lt;/P&gt;&lt;P&gt;ColB&lt;/P&gt;&lt;P&gt;, CAST(DimA_NK as long) as DimA_NK --&amp;gt; CAST AS LONG&lt;/P&gt;&lt;P&gt;FROM TableTwoFirst&lt;/P&gt;&lt;P&gt;LEFT JOIN global_temp.v_dim_One ON ColA=ColB&lt;/P&gt;&lt;P&gt;UNION&lt;/P&gt;&lt;P&gt;SELECT&lt;/P&gt;&lt;P&gt;ColB&lt;/P&gt;&lt;P&gt;, CAST(DimA_NK as long) as DimA_NK --&amp;gt; CAST AS LONG&lt;/P&gt;&lt;P&gt;FROM TableTwoSecond&lt;/P&gt;&lt;P&gt;LEFT JOIN global_temp.v_dim_One ON ColA=ColB&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So the conclusion, however weird, is that whenever a UNION is used, every column that comes from a joined table and is created through a row_number function must be cast as a LONG. If anyone can explain why this is required even when the rownum value is very small, I am open to explanations!&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 14:27:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/62755#M6893</guid>
      <dc:creator>HeijerM84</dc:creator>
      <dc:date>2024-03-06T14:27:21Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64383#M6894</link>
      <description>&lt;P&gt;The same bug is affecting me, but only when using a Databricks runtime 14.3 LTS single user cluster.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm using row_number() ordered by string columns, which should result in an integer. However, Spark seems to raise an internal error about being unable to convert between Long and Int.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Example code:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;WITH A AS (
  SELECT
  --opportunityid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_a.opportunityid) AS PK_A
  FROM 20_silver_crmeiw.opportunities AS tbl_a
)

, B AS (
  SELECT
  --mcw_contractid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_b.mcw_contractid) AS PK_B
  FROM 20_silver_crmeiw.mcw_contracts AS tbl_b
)

, C AS (
  SELECT
    PK_A
  FROM A

  UNION
  
  SELECT
    PK_B
  FROM B
)

SELECT
  *
FROM A
LEFT JOIN C
ON 1=1&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Without the join, this code is fine:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;WITH A AS (
  SELECT
  --opportunityid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_a.opportunityid) AS PK_A
  FROM 20_silver_crmeiw.opportunities AS tbl_a
)

, B AS (
  SELECT
  --mcw_contractid is a string
    ROW_NUMBER() OVER(ORDER BY tbl_b.mcw_contractid) AS PK_B
  FROM 20_silver_crmeiw.mcw_contracts AS tbl_b
)

SELECT
PK_A
FROM A

UNION

SELECT
PK_B
FROM B&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The queries SELECT typeof(PK_A) FROM A and SELECT typeof(PK_B) FROM B both return 'int'.&lt;/P&gt;</description>
      <pubDate>Fri, 22 Mar 2024 09:36:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64383#M6894</guid>
      <dc:creator>jeroenvs</dc:creator>
      <dc:date>2024-03-22T09:36:21Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64635#M6896</link>
      <description>&lt;P&gt;The example is a simplified query to illustrate the situation. The join condition is not a factor in this issue. The factors that cause the issue are the row_number, the union, and the layered select.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Mar 2024 12:41:58 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64635#M6896</guid>
      <dc:creator>HeijerM84</dc:creator>
      <dc:date>2024-03-26T12:41:58Z</dc:date>
    </item>
    <item>
      <title>Re: Databricks runtime 14.3 gives error scala.math.BigInt cannot be cast to java.lang.Integer</title>
      <link>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64783#M6897</link>
      <description>&lt;P&gt;I logged the issue with Microsoft last week, and they confirmed it is a Databricks bug. A fix is reportedly being rolled out across Databricks regions at the moment.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;As anticipated,&amp;nbsp;we have engaged the Databricks core team to further investigate the issue and get an update.&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;They have confirmed&amp;nbsp;that the issue was caused by a regression in DBR 14.3, that it has been fixed, and that the fix should be ready in all regions soon.&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;In the meantime, they also shared the following two configs, either of which can be used to mitigate the issue:&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;spark.databricks.optimizer.estimateUnion.enabled&amp;nbsp;to false&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;or&amp;nbsp;&lt;/P&gt;&lt;P class="lia-indent-padding-left-30px"&gt;spark.databricks.optimizer.propagateStatsThroughWindow.enabled to false&lt;/P&gt;</description>
      <pubDate>Wed, 27 Mar 2024 10:39:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/databricks-runtime-14-3-gives-error-scala-math-bigint-cannot-be/m-p/64783#M6897</guid>
      <dc:creator>jeroenvs</dc:creator>
      <dc:date>2024-03-27T10:39:00Z</dc:date>
    </item>
  </channel>
</rss>

