Inconsistent behaviour in group by and order by
10-26-2024 03:27 AM
Consider this minimal example:
with t as (select explode(sequence(1,10,1)) as id)
select (id%2) as id from t
group by id
order by id
I would expect an ambiguous column name exception, since the grouping and sorting could each apply to either of two different `id` columns. Instead, the grouping is applied to t.id and the ordering is applied to (t.id%2).
Is there a setting we can apply to trigger an error when there is ambiguity? Can this be escalated?
10-26-2024 05:58 AM
You could raise this as a Spark issue in the Apache JIRA:
https://issues.apache.org/jira/projects/SPARK/issues/SPARK-48139?filter=allopenissues
10-26-2024 06:32 AM
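Hi,
This is not a bug; it follows from the order in which the clauses of a SQL query are evaluated. GROUP BY is evaluated before the SELECT list, so it refers to the source column t.id. ORDER BY is evaluated after the SELECT list, so it resolves names against the selected (output) columns first. Because the alias reuses the name id, the two are easy to confuse.
An ambiguous column name exception occurs when the same column name is available from two tables (for example, in a join) and you reference it without qualifying it or aliasing it differently.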
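One way to sidestep the name clash discussed in this thread is to give the computed column its own alias and group by the expression itself, so that GROUP BY and ORDER BY can only mean one thing. This is a minimal sketch, assuming Spark SQL; the alias id_mod and the count column n are hypothetical names that do not appear in the original query.

```sql
-- Sketch only: id_mod and n are illustrative names, not from the original post.
with t as (select explode(sequence(1, 10, 1)) as id)
select (id % 2) as id_mod,   -- distinct alias, so it cannot be confused with t.id
       count(*) as n
from t
group by id % 2              -- group by the expression rather than a reused name
order by id_mod              -- refers unambiguously to the select-list alias
```

Written this way, the query groups on the remainder rather than on t.id, which should yield one row per remainder instead of one row per original id.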
I hope this helps!