cancel
Showing results for 
Search instead for 
Did you mean: 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results for 
Search instead for 
Did you mean: 

lateral view explode in databricks - need help

koantek_user
New Contributor
We are working on snowflake to databricks migration and we encountered the lateral flatten function of snowflake which we tried to convert to lateral view explode in databricks- but its output is a subset of lateral flatten
-----------------------
The output of this flatten contains a result containing 6 elements as shown on this page with  (index,value) being major in some of the queries 
 
In case of tables with multiple array columns - these index values are used to combine data stored in 2 different array columns by performing joins 
e.g. 
col1 <1,2,3>, col2 <abc,def,ghi>
 
indices of 1 and abc, 2 and def, 3 and ghi are same and they are related data values stored in 2 different columns in a table
--------------------------------------
However,in Databricks - its equivalent is LATERAL VIEW [OUTER] EXPLODE  - which just returns the exploded values and not their indices.
 
this doesn't allow us to combine such data items as mentioned above
 
---------------------------
Any suggestions would be helpful
 
1 REPLY 1

Brahmareddy
Honored Contributor

Hi AzureSnowflake,

I see you're migrating from Snowflake to Databricks and running into some issues with the LATERAL FLATTEN function in Snowflake. Specifically, you're finding that the LATERAL VIEW EXPLODE in Databricks isn't providing the full output, particularly the indices, that you need to join related data from multiple array columns.

To work around this in Databricks, you can manually generate indices for each element in the arrays using functions like posexplode, which provides both the index and value.

Please find the example below

SELECT
t.*,
p1.pos AS index,
p1.col1 AS value1,
p2.col2 AS value2
FROM
my_table t
LATERAL VIEW POSEXPLODE(t.col1) p1 AS index, col1
LATERAL VIEW POSEXPLODE(t.col2) p2 AS index, col2
WHERE
p1.index = p2.index

Just a thought, give a try and do let me know if it works!

Have a good day.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.

If there isn’t a group near you, start one and help create a community that brings people together.

Request a New Group