Databricks Community

SailajaB · ‎12-07-2021

Hi,

We are writing our flattened JSON DataFrame to a user-defined nested JSON schema using PySpark in Databricks, but we are not getting the expected format.

Expecting :

{"ID":"aaa","c_id":[{"con":null,"createdate":"2015-10-09T00:00:00Z","data":null,"id":"1"},{"con":null,"createdate":"2015-10-09T00:00:00Z","data":null,"id":"2"},{"con":null,"createdate":"2015-10-09T00:00:00Z","data":null,"id":"3"}]}

But Getting :

{"ID":"aaa","c_id":{"con":null,"createdate":"2015-10-09T00:00:00Z","data":null,"id":"1"}},

{"ID":"aaa","c_id":{"con":null,"createdate":"2015-10-09T00:00:00Z","data":null,"id":"2"}},

{"ID":"aaa","c_id":{"con":null,"createdate":"2015-10-09T00:00:00Z","data":null,"id":"3"}}

We tried groupBy with collect_list, but did not get the expected format.

Could someone please tell us whether there is a way to achieve this?

Thank you in advance

-werners- · ‎12-08-2021

I don't know what your code is, so you should probably share it.

Please also share the starting JSON.

Hubert-Dudek · ‎12-08-2021

As @werners said, you need to share the code. If you are going from a DataFrame to JSON, you probably need an array of structs (ArrayType of StructType) to get that list, but without the code it is hard to help.

SailajaB · ‎12-08-2021

Hi,

Thank you for the reply.

Here is the relevant piece of code:

from pyspark.sql.functions import struct

# Build the nested "Definitions" struct from the flattened columns, then write it out as JSON.
df_global_op = (
    df_global.withColumn("Definitions", struct(
        df_global.id.alias("ID"),
        struct(
            df_global.a.alias("con"),
            df_global.b.alias("createdate"),
            df_global.c.alias("data"),
            df_global.d.alias("id"),
        ).alias("c_id")))
    .drop(*global_fields).select("Definitions.*").distinct()
    .write.format("json")
    .option("ignoreNullFields", "false")
    .save("/mnt/test/op/12-08-2021")
)

Please note that df_global is a flattened DataFrame of the input JSON. We derive the output JSON from this flattened DataFrame, based on the requested schema.

Thank you

SailajaB · ‎12-08-2021

@HubertDudek, @werners

Is there any way to resolve the issue above?

Thank you

facing format issue while converting one type nested json to other brand new json schema