- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
09-08-2021 04:47 AM
I’m trying to use sql query on azure-databricks with distinct sort and aliases
SELECT DISTINCT album.ArtistId AS my_alias
FROM album ORDER BY album.ArtistIdThe problem is that if I add an alias then I can not use not aliased name in the order by clause.
ORDER BY album.ArtistId part produces an error. ORDER BY my_alias works.
If I remove distinct it also works.
Seems like after Project step original column name is lost. That behavior is unexpected for SQL compared to other SQL dialects. And I can not find any documentation about it.
Error in SQL statement: AnalysisException: cannot resolve '`album.ArtistId`' given input columns: [my_alias]; line 2 pos 22;
'Sort ['album.ArtistId ASC NULLS FIRST], true
+- Distinct
+- Project [ArtistId#2506 AS my_alias#2500]
+- SubqueryAlias spark_catalog.chinook.album
+- Relation[AlbumId#2504,Title#2505,ArtistId#2506] parquetIs there any way to make this query run as is, without modification maybe by changing some execution flags?
The query is generated by sqlachemy and works on other databases fine, I prefer to not edit it for databricks.
The schema used for testing is Chinook database.
The where clause works fine
SELECT DISTINCT album.ArtistId AS my_alias
FROM album WHERE album.ArtistId = 1
- Labels:
-
Alias
-
Error Column
-
Order By
-
SQL
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
09-27-2021 10:14 AM
Hi ,
can you please try with the following.
%sql SELECT DISTINCT album.ArtistId AS my_alias
FROM album ORDER BY 1 descor
%sql SELECT DISTINCT album.ArtistId AS my_alias
FROM album ORDER BY my_alias desc
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
10-07-2021 01:31 PM
The code from above is worked in both cases.