Data Engineering

Doing a join within the same row in SQL

qwerty1
Contributor

My data is a dump of the JSON response from an API. The schema of the JSON is:

col_name  data_type
 
data           array<struct<attributes:struct<name: String, age: Int, relationships:struct<address:struct<data:array<struct<id: long, type: string>>>>>>>
 
included    array<struct<id: long, type: string, attributes:struct<address: string, postalCode: string, country: string>>>

As you can see, the column `data` contains an array of person details, and each person includes a relationship to that person's address via an id. The column `included` contains the actual address.

I want to transform this data into a new table where the person data includes the address. In short, I want to get rid of this `included` business. I only have SQL to work with right now because I am using this in a STREAMING LIVE TABLE query.

1 ACCEPTED SOLUTION


I used a similar solution (exploding only one column) and it worked
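
For readers looking for the shape of such a query, here is a minimal sketch of one way to explode only `data` and look the address up inside the same row. It is not the poster's actual query. The table name `raw_api_responses` and the output column names are hypothetical. The path `person.attributes.relationships` follows the schema as written in the question; if `relationships` is really a sibling of `attributes`, use `person.relationships` instead.

```sql
-- Sketch only: raw_api_responses is a hypothetical source table
-- holding the `data` and `included` columns from the question.
SELECT
  person.attributes.name AS name,
  person.attributes.age  AS age,
  -- Keep only the `included` entries whose id is referenced by this person,
  -- then keep just their address attributes.
  transform(
    filter(
      included,
      inc -> array_contains(
        transform(person.attributes.relationships.address.data, ref -> ref.id),
        inc.id
      )
    ),
    inc -> inc.attributes
  ) AS addresses
FROM (
  -- Explode only `data`; `included` rides along unchanged on every person row.
  SELECT explode(data) AS person, included
  FROM raw_api_responses
) AS exploded
```

Because `included` stays an array on each exploded row and is searched with `filter`, the lookup never leaves the row and no join is involved. `addresses` comes back as an array, since the schema allows a person to reference more than one address.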


2 REPLIES

@Kaniz Fatma isn't this basically doing an "explode" on "data" and "included" and then joining them? We end up doing a join on the whole data set instead of within the row.
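
To make the "within the row" idea concrete, here is a hedged sketch that uses Spark SQL higher-order functions (`transform`, `filter`) and no explode at all, so each API response stays a single row. The table name `raw_api_responses` and the field names in the output struct are hypothetical, and the path `person.attributes.relationships` assumes the schema exactly as posted; adjust it if `relationships` sits next to `attributes` rather than inside it.

```sql
-- Sketch only: one output row per input row, with addresses resolved in place.
SELECT
  transform(
    data,
    person -> named_struct(
      'name', person.attributes.name,
      'age',  person.attributes.age,
      -- Look up this person's address ids in the same row's `included` array.
      'addresses', transform(
        filter(
          included,
          inc -> array_contains(
            transform(person.attributes.relationships.address.data, ref -> ref.id),
            inc.id
          )
        ),
        inc -> inc.attributes
      )
    )
  ) AS people
FROM raw_api_responses
```

The trade-off is that the result is still nested: `people` is an array of structs, so a later `explode(people)` would be needed to get one row per person.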

