Reading multi-dimensional json files
12-07-2022 05:31 AM
So I've been having some issues reading a JSON file that's been provided to the business with an extra nesting layer. Instead of the JSON being an:
- 'array of objects' -> [ {} ,{} ,{} ]
- It's an 'array of arrays of objects' -> [ [ {}, {} ,{} ], [ {} ,{} ,{} ] ]
While the first reads fine with Spark's multiline option, the second comes back with the correct column schema, though every column is just a null value (the actual file content looks good).
So far I've tried to create a custom struct schema to deal with the extra layer, but have not had any luck getting it to work. It just returns nulls.
Is there something obvious that I'm missing?
- Labels: JSON, Null Value, Schema, Structtype
01-30-2023 01:20 PM
You can use the explode function to flatten the array into rows. Can you post a simple example of your data?
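In the meantime, here is a rough sketch of how explode could be applied to the 'array of arrays of objects' shape. It makes several assumptions that are not from the original post: the field names `id` and `name`, the sample data, and the helper name `read_nested_json` are all hypothetical. It reads the whole file as a single string, parses it with `from_json` using an array-of-arrays schema, then uses `flatten` and `explode` (Spark 2.4 or later) to get one row per object. Treat it as a starting point rather than a confirmed fix for your file.

```python
import json

# Hypothetical sample with the same shape as in the question:
# an array of arrays of objects. The field names are assumptions.
SAMPLE = json.dumps([
    [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}],
    [{"id": 3, "name": "c"}],
])


def read_nested_json(spark, path):
    """Return one row per inner object of a file shaped like [[{...}, {...}], [{...}]]."""
    # Imported here so the sketch can be loaded without a Spark installation.
    from pyspark.sql import functions as F
    from pyspark.sql.types import (
        ArrayType, LongType, StringType, StructField, StructType,
    )

    # Assumed schema of a single object; replace with your real fields.
    record = StructType([
        StructField("id", LongType()),
        StructField("name", StringType()),
    ])
    # The extra nesting layer: an array of arrays of records.
    nested = ArrayType(ArrayType(record))

    # Read the entire file as one string in a single "value" column.
    raw = spark.read.text(path, wholetext=True)

    # Parse the string against the nested schema.
    parsed = raw.select(F.from_json(F.col("value"), nested).alias("outer"))

    # flatten merges the inner arrays into one array; explode yields one row per element.
    records = parsed.select(F.explode(F.flatten("outer")).alias("rec"))

    # Promote the struct fields to top-level columns.
    return records.select("rec.*")
```

If you write `SAMPLE` to a file and call `read_nested_json(spark, path)` on it, you should get one row per inner object with `id` and `name` columns, assuming the schema matches your data. If the fields don't line up with the schema, `from_json` returns null, which may be what you are seeing with your custom struct.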