Databricks Community

sebasv · ‎10-26-2024

Consider this minimal example:

with t as (select explode(sequence(1,10,1)) as id)
select (id%2) as id from t
group by id
order by id

I would expect an ambiguous column name exception, since the grouping and the sorting could each apply to either of two different `id` columns. Instead, the grouping is applied to t.id and the ordering is applied to (t.id%2).

Is there a setting we can apply to trigger an error when there is ambiguity? Can this be escalated?

gchandra · ‎10-26-2024

You can create a Spark issue for this in the Apache JIRA:

https://issues.apache.org/jira/projects/SPARK/issues/SPARK-48139?filter=allopenissues

SathyaSDE · ‎10-26-2024

Hi,

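This is not a bug; it follows from the order in which the clauses of a SQL query are evaluated. GROUP BY is evaluated before SELECT, so it resolves `id` against the input column t.id. ORDER BY is evaluated after SELECT, so it refers to the selected (displayed) columns. Since everything is named `id` here, I guess that is where the confusion comes from.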

An ambiguous column name exception occurs when you reference the same column name from two tables without qualifying or aliasing the columns differently.
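
One way to sidestep the name clash is sketched below. It assumes the intent is to group and sort by the parity of id; the alias `id_parity` is a hypothetical name chosen only for illustration and does not appear in the original query.

```sql
-- Hypothetical rewrite of the query from the question.
-- Assumption: the goal is one row per distinct value of id % 2.
with t as (select explode(sequence(1, 10, 1)) as id)
select (id % 2) as id_parity   -- illustrative alias, distinct from the input column name
from t
group by id % 2                -- group by the expression itself rather than by a name
order by id_parity             -- only one column carries this name, so nothing can clash
```

If grouping by the original column is what you actually want, writing `group by t.id` states that explicitly instead of relying on how the bare name `id` happens to be resolved.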

I hope it helps!!