06-08-2022 11:47 PM
I have a nested struct in which one of the fields is a string. It looks something like this:
string =
"[{\"to_loc\":\"6183\",\"to_loc_type\":\"S\",\"qty_allocated\":\"18\"},{\"to_loc\":\"6137\",\"to_loc_type\":\"S\",\"qty_allocated\":\"9\"},{\"to_loc\":\"6088\",\"to_loc_type\":\"S\",\"qty_allocated\":\"9\"}]"
My goal is to get it into an array of structs, so that each struct in this string can be exploded into a new row, like this:
C1, C2, C3
_ , _ , {"to_loc":"6183","to_loc_type":"S","qty_allocated":"18"}
_ , _ , {"to_loc":"6137","to_loc_type":"S","qty_allocated":"9"}
_ , _ , {"to_loc":"6088","to_loc_type":"S","qty_allocated":"9"}
Finally, each of those keys should be pulled out into its own column.
I've tried casting the string column to an array of structs, but Spark refuses to convert it. Any help on this would be appreciated.
The final schema =
ArrayType(StructType([
    StructField("to_loc", StringType(), True),
    StructField("to_loc_type", StringType(), True),
    StructField("qty_allocated", StringType(), True),
]))
06-09-2022 12:31 AM
OK, this is not a complete answer, but my first guess would be to use the explode() or posexplode() function to create separate records from the array members.
06-09-2022 01:00 AM
OK, so I got it working.
Call the from_json() function with the string column as the first argument and the schema as the second. With the ArrayType schema above, it converts the string into an array of structs, which can then be exploded.
06-09-2022 01:04 AM
Can you mark the question as answered so others can find the solution?
06-09-2022 02:43 AM
I've marked my comment as best. Does anything else need to be done?
06-09-2022 03:11 AM
Nope 🙂