Translations from T-SQL: TOP 1 OUTER APPLY or LEFT JOIN

MattHeidebrecht · 12-05-2024

Hi All,

I am wondering how you would go about translating either of the statements below to Spark SQL in Databricks. They are more or less equivalent in T-SQL.

Please note that I am attempting to pair each unique Policy (IPI_ID) record with its highest-numbered Location (IL_ID) record. There can be many Location records for each Policy record. The Location table links to the Policy table via Policy.IPI_ID = Location.IL_IPI_ID.

I have tried to use LIMIT 1 in a few ways (example further below), but I either receive errors or the results do not match.

Any help or suggestions are appreciated!

T-SQL:

select
	ipi.IPI_ID
	,loc.IL_ID
from Policy ipi
outer apply
	(
	select top 1 il.IL_ID
	from Location il
	where il.IL_IPI_ID = ipi.IPI_ID
	order by
		il.IL_ID desc
	) loc

--

select
	ipi.IPI_ID
	,il.IL_ID
from Policy ipi
left join Location il
	on il.IL_ID =
		(
		select top 1 il2.IL_ID
		from Location il2
		where il2.IL_IPI_ID = ipi.IPI_ID
		order by
			il2.IL_ID desc
		)

Errors out in Databricks Spark SQL:

select
	ipi.IPI_ID
	,il.IL_ID
from Policy ipi
left join Location il
	on il.IL_ID =
		(
		select il2.IL_ID
		from Location il2
		where il2.IL_IPI_ID = ipi.IPI_ID
		order by
			il2.IL_ID desc
		limit 1
		);
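
For reference, here is a minimal sketch of one way the "highest-numbered Location per Policy" pairing could be written without a correlated LIMIT 1 subquery: a LEFT JOIN plus MAX() grouped by policy. The Python sqlite3 harness and the sample rows are assumptions added purely so the sketch runs standalone; only the table and column names come from the post. The query text uses only a plain join and aggregate, so it should carry over to Spark SQL, though that is an assumption rather than something confirmed on Databricks.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the post.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Policy (IPI_ID integer primary key);
    create table Location (IL_ID integer primary key, IL_IPI_ID integer);
    insert into Policy values (1), (2), (3);
    insert into Location values (10, 1), (11, 1), (20, 2);
""")

# Greatest-per-group without a correlated LIMIT 1 subquery:
# the left join keeps policies that have no locations (IL_ID comes back null),
# and max() picks the highest IL_ID within each policy.
query = """
    select
        ipi.IPI_ID
        ,max(il.IL_ID) as IL_ID
    from Policy ipi
    left join Location il
        on il.IL_IPI_ID = ipi.IPI_ID
    group by
        ipi.IPI_ID
    order by
        ipi.IPI_ID
"""

rows = conn.execute(query).fetchall()
print(rows)  # [(1, 11), (2, 20), (3, None)]
```

Policy 3 has no Location rows and still appears with a null IL_ID, which mirrors what OUTER APPLY returns. If other Location columns were needed alongside IL_ID, a common generalization is to rank rows with row_number() over (partition by IL_IPI_ID order by IL_ID desc) and keep only rank 1.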
