cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Doing a a join within the same row in SQL

qwerty1
Contributor

My data is a dump of JSON response from an API. The schema of the json is

col_name  data_type
 
data           array<struct<attributes:struct<name: String, age: Int relationships:struct<address:struct<data:arraay<struct<id: long, type: string>>>>>>>
 
included    array<struct<id: long, type: string, attributes:struct<address: string, postalCode: string, country: string>>>

 As you can see the column data contains an array of person details and includes a relationship to that person's address via an id. The column included contains the the actual address.

I want to transform this data into a new table where the person data includes the address. In short I want to get rid of this `included` business. I only have SQL to go with right now because I am using this in a STREAMING LIVE TABLE query.

1 ACCEPTED SOLUTION

Accepted Solutions

I used a similar solution (exploding only one column) and it worked

View solution in original post

2 REPLIES 2

@Kaniz Fatmaโ€‹ isn't this basically doing an "explode" on "data" and "included" and then joining them? We end up doing join on the whole data set instead of within the row.

I used a similar solution (exploding only one column) and it worked

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group