Function returns UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-18-2023 01:04 AM
Hello experts. The below function in Databricks gives UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY error. We didn't have this issue though in Oracle. Is this a limitation of Databricks? Just to note the final result returns only one row. Thank you in advance
create or replace function GET_PREV_DAY(vdate DATE, vdays_bef decimal(38) DEFAULT 1) returns DATE
return
with cte as
( select (t.DATE_FIELD), row_number() over (order by t.DATE_FIELD desc) rn
from time t
where t.DATE_FIELD = (p_date)
and t.WORK_DAYS = 1
) select (cte.DATE_FIELD) as l_result from cte where rn= p_days_before;
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-20-2023 07:22 PM
@ELENI GEORGOUSI :
The UNSUPPORTED_CORRELATED_SCALAR_SUBQUERY error occurs because the subquery in the WHERE clause contains a reference to the main query's column "p_date". This type of subquery is called a correlated scalar subquery and it is not supported in Databricks.
One way to rewrite this query to avoid the error is to use a JOIN or a subquery to filter the rows based on the DATE_FIELD column only, and then use the ROW_NUMBER function to get the row number of the desired row:
CREATE OR REPLACE FUNCTION GET_PREV_DAY(vdate DATE, vdays_bef decimal(38) DEFAULT 1)
RETURNS DATE
AS
$$
SELECT DATE_FIELD
FROM (
SELECT DATE_FIELD, ROW_NUMBER() OVER (ORDER BY DATE_FIELD DESC) AS rn
FROM time
WHERE WORK_DAYS = 1 AND DATE_FIELD < vdate
) subquery
WHERE rn = vdays_bef;
$$
In this rewritten query, the subquery filters the rows based on the WORK_DAYS and DATE_FIELD columns only, and then the main query selects the row with the desired row number. This approach should work in Databricks and also in Oracle.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
04-21-2023 03:03 AM
Hello, we have changed the function as you mentioned but still the same issue is present.
%sql
create or replace function test_get(v_date DATE, days_before decimal(38) DEFAULT 1) returns DATE
return
select (xx) as l_result FROM
(select (t.xx), row_number() over (order by t.xx desc) as rn
from time_table t
where t.xx = (v_date)
and t.DA_WORK_DAYS = 1
) sub
where sub.rn= days_before;
analysis exception: Correlated scalar subqueries must be aggregated: Filter (cast(cast(rn#8811 as decimal(10,0)) as decimal(38,0)) = outer(p_days_before#8814))
+- SubqueryAlias sub
+- Window [xx#8846, row_number() windowspecdefinition(xx#8846 DESC NULLS LAST, specifiedwindowframe(RowFrame, unboundedpreceding$(), currentrow$())) AS rn#8811], [xx#8846 DESC NULLS LAST]
+- Project [xx#8846]
+- Filter ((xx#8846 = outer(p_date#8813)) AND (cast(DA_WORK_DAYS#8860 as int) = 1))
+- SubqueryAlias t
+- SubqueryAlias spark_catalog.time_table
+- Relation
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-08-2023 03:13 AM
Dear experts,
we have not received any answer on this please respond.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
05-09-2023 01:29 AM
Hello @Suteja Kanuri can we have any respond on the above?
Thank you.