12-15-2022 08:44 PM
I have the following input:
I am looking for a way to split the characters in the item_order_detail column into 2 columns itemID and itemName. As below output table uses SQL function in databricks with spark_sql version 3.2.1.
Can someone suggest a solution for me?
Thanks all
12-15-2022 10:11 PM
Hi, you got to extract itemID and itemName from single column item_order_detail - you can make use of from_json function for that. I hope following code helps,
SELECT order_id, from_json(item_order_details, 'itemid STRING, itemname STRING') as items
FROM orders
SELECT order_id, items.itemid, items.itemname
FROM orders
LATERAL VIEW from_json(item_order_details, 'itemid STRING, itemname STRING') items as items
12-15-2022 10:42 PM
and pyspark for the same-
df = spark.table("orders")
df = df.withColumn("items", from_json(df["item_order_details"], "itemid STRING, itemname STRING"))
df = df.select("order_id", "items.itemid", "items.itemname")
12-16-2022 08:02 AM
Hi @Wasim Inamdar
I tried it and have error bellow:
SELECT order_id, from_json(item_order_details, 'itemid STRING, itemname STRING') as items
FROM orders
result
and
SELECT order_id, items.itemid, items.itemname
FROM orders
LATERAL VIEW from_json(item_order_details, 'itemid STRING, itemname STRING') items as items
Error in SQL statement: AnalysisException: from_json is expected to be a generator. However, its class is org.apache.spark.sql.catalyst.expressions.JsonToStructs, which is not a generator
12-16-2022 12:53 PM
If you have a struct type and not json, you can explode the column and then extract the parts. https://spark.apache.org/docs/latest/api/sql/index.html#explode will unpack or flatmap the array. Just select order and explode("item_order_details")
12-21-2022 01:08 AM
Hi @Du Lu , We haven’t heard from you since the last response from @Joseph Kambourakis and I was checking back to see if his suggestions helped you.
Or else, If you have any solution, please share it with the community, as it can be helpful to others.
Also, Please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.
01-05-2023 09:49 PM
you need to use explode function
https://stackoverflow.com/questions/61070630/spark-explode-column-with-json-array-to-rows
Join our fast-growing data practitioner and expert community of 80K+ members, ready to discover, help and collaborate together while making meaningful connections.
Click here to register and join today!
Engage in exciting technical discussions, join a group with your peers and meet our Featured Members.