04-18-2023 01:04 AM
Hello experts. The function below gives an UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY error in Databricks. We didn't have this issue in Oracle, though. Is this a limitation of Databricks? Just to note, the final result returns only one row. Thank you in advance.
create or replace function GET_PREV_DAY(p_date DATE, p_days_before decimal(38) DEFAULT 1) returns DATE
return
with cte as
( select t.DATE_FIELD, row_number() over (order by t.DATE_FIELD desc) rn
  from time t
  where t.DATE_FIELD = p_date
    and t.WORK_DAYS = 1
) select cte.DATE_FIELD as l_result from cte where rn = p_days_before;
04-20-2023 07:22 PM
@ELENI GEORGOUSI:
The UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY error occurs because the scalar subquery in the function body references the outer function parameter "p_date". This makes it a correlated scalar subquery, a pattern that Databricks does not support here.
One way to rewrite this query to avoid the error is to use a JOIN or a subquery to filter the rows based on the DATE_FIELD column only, and then use the ROW_NUMBER function to get the row number of the desired row:
CREATE OR REPLACE FUNCTION GET_PREV_DAY(vdate DATE, vdays_bef DECIMAL(38) DEFAULT 1)
RETURNS DATE
RETURN
  SELECT DATE_FIELD
  FROM (
    SELECT DATE_FIELD, ROW_NUMBER() OVER (ORDER BY DATE_FIELD DESC) AS rn
    FROM time
    WHERE WORK_DAYS = 1 AND DATE_FIELD < vdate
  ) subquery
  WHERE rn = vdays_bef;
In this rewritten query, the subquery filters the rows based on the WORK_DAYS and DATE_FIELD columns only, and then the main query selects the row with the desired row number. This approach should work in Databricks and also in Oracle.
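For what it's worth, a quick sanity check after creating the function could look like the following (the table name `time` and its contents are as assumed in this thread, so the actual value returned depends on your data):

```sql
-- Assuming the time table is populated with work-day rows,
-- this should return the most recent work day strictly before the given date.
SELECT GET_PREV_DAY(DATE'2023-04-18', 1) AS prev_work_day;
```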
04-21-2023 03:03 AM
Hello, we have changed the function as you mentioned, but the same issue is still present.
%sql
create or replace function test_get(v_date DATE, days_before decimal(38) DEFAULT 1) returns DATE
return
select xx as l_result from
( select t.xx, row_number() over (order by t.xx desc) as rn
  from time_table t
  where t.xx = v_date
    and t.DA_WORK_DAYS = 1
) sub
where sub.rn = days_before;
AnalysisException: Correlated scalar subqueries must be aggregated: Filter (cast(cast(rn#8811 as decimal(10,0)) as decimal(38,0)) = outer(p_days_before#8814))
+- SubqueryAlias sub
+- Window [xx#8846, row_number() windowspecdefinition(xx#8846 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#8811], [xx#8846 DESC NULLS LAST]
+- Project [xx#8846]
+- Filter ((xx#8846 = outer(p_date#8813)) AND (cast(DA_WORK_DAYS#8860 as int) = 1))
+- SubqueryAlias t
+- SubqueryAlias spark_catalog.time_table
+- Relation
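One possible workaround is suggested by the analyzer message itself ("Correlated scalar subqueries must be aggregated"): since `rn = days_before` matches at most one row, wrapping the selected column in an aggregate such as `max()` turns the subquery into an aggregated one, which the analyzer may then accept. This is only a sketch, reusing the hypothetical `time_table`/`xx`/`DA_WORK_DAYS` names from the post above and assuming the intent is "n-th most recent work day strictly before v_date"; it has not been verified against this schema:

```sql
create or replace function test_get(v_date DATE, days_before decimal(38) DEFAULT 1) returns DATE
return
select max(xx)               -- aggregate over the (at most one) row where rn = days_before
from (
  select t.xx, row_number() over (order by t.xx desc) as rn
  from time_table t
  where t.xx < v_date        -- strictly earlier dates, so rn = 1 is the previous work day
    and t.DA_WORK_DAYS = 1
) sub
where sub.rn = days_before;
```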
05-08-2023 03:13 AM
Dear experts,
we have not received any answer on this. Please respond.
05-09-2023 01:29 AM
Hello @Suteja Kanuri, can we have a response on the above?
Thank you.