cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

Query with distinct sort and alias produces error column not found

Kotofosonline
New Contributor III

I’m trying to use sql query on azure-databricks with distinct sort and aliases

SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album ORDER BY album.ArtistId

The problem is that if I add an alias then I can not use not aliased name in the order by clause.

ORDER BY album.ArtistId part produces an error. ORDER BY my_alias works.

If I remove distinct it also works.

Seems like after Project step original column name is lost. That behavior is unexpected for SQL compared to other SQL dialects. And I can not find any documentation about it.

Error in SQL statement: AnalysisException: cannot resolve '`album.ArtistId`' given input columns: [my_alias]; line 2 pos 22;
    'Sort ['album.ArtistId ASC NULLS FIRST], true
    +- Distinct
       +- Project [ArtistId#2506 AS my_alias#2500]
          +- SubqueryAlias spark_catalog.chinook.album
             +- Relation[AlbumId#2504,Title#2505,ArtistId#2506] parquet

Is there any way to make this query run as is, without modification maybe by changing some execution flags?

The query is generated by sqlachemy and works on other databases fine, I prefer to not edit it for databricks.

The schema used for testing is Chinook database.

The where clause works fine

SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album WHERE album.ArtistId = 1

1 ACCEPTED SOLUTION

Accepted Solutions

User16752246494
Contributor

Hi ,

can you please try with the following.

%sql SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album ORDER BY 1 desc

 or

%sql SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album ORDER BY my_alias desc

View solution in original post

2 REPLIES 2

User16752246494
Contributor

Hi ,

can you please try with the following.

%sql SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album ORDER BY 1 desc

 or

%sql SELECT DISTINCT album.ArtistId AS my_alias 
    FROM album ORDER BY my_alias desc

Kotofosonline
New Contributor III

The code from above is worked in both cases.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group