01-24-2023 12:16 PM
I noticed that a query built from inline selects and joins fails on the client with 'Error occurred while deserializing arrow data'. That is, the query succeeds on Databricks, but the client (DBeaver, AtScale) receives an error.
The error only occurs with Databricks SQL when the channel is set to preview.
The same query works fine with Databricks SQL when the channel is set to current.
The issue can be reproduced with the following query against the samples.tpch.nation table:
SELECT
  SUM(`t_36`.`noq_empty`) `ne36`,
  `t_36`.`nname` `nname`,
  SUM(`t_94`.`c0`) `c94`
FROM
  (
    SELECT
      `t_59`.`noq_empty_gbakc13` `noq_empty`,
      `t_59`.`n_gbakc10` `nname`
    FROM
      (
        SELECT
          SUM(ntn1.n_nationkey) `noq_empty_gbakc13`,
          ntn1.n_name `n_gbakc10`
        FROM
          samples.tpch.nation ntn1
        WHERE
          true
        GROUP BY
          `n_gbakc10`
      ) `t_59`
  ) `t_36`
  JOIN (
    SELECT
      `t_135`.`c0` `c0`,
      `t_135`.`nname` `nname`
    FROM
      (
        SELECT
          `t_134`.`ci` `c0`,
          `t_134`.`nname` `nname`
        FROM
          (
            SELECT
              `t_133`.`ci_gbakc2` `ci`,
              `t_133`.`sono_gbakc3` `nname`
            FROM
              (
                SELECT
                  SUM(ntn10.n_nationkey) `ci_gbakc2`,
                  ntn10.n_name `sono_gbakc3`
                FROM
                  samples.tpch.nation ntn10
                WHERE
                  true
                GROUP BY
                  `sono_gbakc3`
                HAVING
                  SUM(`ntn10`.`n_nationkey`) > 1
              ) `t_133`
          ) `t_134`
        WHERE
          `t_134`.`ci` > 1
      ) `t_135`
    WHERE
      `t_135`.`c0` > 1
  ) `t_94` ON `t_94`.`nname` = `t_36`.`nname`
GROUP BY
  `t_36`.`nname`
HAVING
  SUM(`t_94`.`c0`) > 10
So I would like to understand: is there going to be a change in how the result-set object is serialized in upcoming releases of DBSQL?
Or is this a bug?
01-25-2023 01:00 AM
And where can we find samples.tpch.nation, or a notebook that creates it?
01-25-2023 09:33 AM
It's one of the sample datasets provided by Databricks
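If it helps, here is a quick way to check that the table is visible before running the full repro. This is only a sketch: it assumes the `samples` catalog is available in your workspace, and it uses only the two columns that the query above references.

```sql
-- Assumption: the built-in `samples` catalog is enabled for this workspace.
-- List the table's columns and types.
DESCRIBE TABLE samples.tpch.nation;

-- Peek at a few rows, limited to the columns the repro query uses.
SELECT n_nationkey, n_name
FROM samples.tpch.nation
LIMIT 5;
```

If both statements return results, the repro query can be run as posted, with no notebook or setup step needed.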
01-25-2023 07:00 PM
Opened an ES on this; it looks like an issue with the Preview channel. Thanks for your help!