Ways to validate final Dataframe schema against JSON schema config file

SailajaB
Valued Contributor III

Hi Team,

We need to validate the schema of our transformed DataFrame output against a JSON schema config file.

Here is the scenario:

Our input JSON schema and target JSON schema are different. We make the required schema changes in Databricks. Now we need to validate the final DataFrame schema against the target JSON schema config file.

Note: the JSON schema is very complex (the input and output differ at up to 7 levels of nesting).

We tried a few Python libraries, but they only work without issues for simple schemas.

We are looking for an approach, for example a way to convert the complex JSON schema config (i.e. the target JSON schema config) into sample JSON data, so that we can easily validate it against the final DataFrame schema.
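
Editor's sketch of the idea described above, assuming the config follows common JSON Schema conventions (`type`, `properties`, `items`). The helper name `sample_from_schema`, the placeholder values, and the example `target_schema` are hypothetical and not taken from the thread; keywords such as `$ref`, `oneOf`, or `enum` are not handled.

```python
import json

# Arbitrary placeholder value for each JSON Schema primitive type (assumption).
_PLACEHOLDERS = {"string": "x", "integer": 0, "number": 0.5,
                 "boolean": True, "null": None}

def sample_from_schema(schema):
    """Build one sample document from a JSON Schema dict (hypothetical helper)."""
    t = schema.get("type")
    if isinstance(t, list):
        # e.g. ["string", "null"]: use the first non-null type
        t = next((x for x in t if x != "null"), "null")
    if t == "object":
        return {name: sample_from_schema(sub)
                for name, sub in schema.get("properties", {}).items()}
    if t == "array":
        # One element is enough to carry the element structure
        return [sample_from_schema(schema.get("items", {}))]
    return _PLACEHOLDERS.get(t)

# Hypothetical, heavily reduced target schema for illustration only.
target_schema = {
    "type": "object",
    "properties": {
        "company": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "employees": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "id": {"type": "integer"},
                            "email": {"type": ["string", "null"]},
                        },
                    },
                },
            },
        },
    },
}

sample = sample_from_schema(target_schema)
print(json.dumps(sample, indent=2))
```

The resulting JSON string could then be read with `spark.read.json` so that Spark infers a schema to compare with the final DataFrame's schema. Note that inferred types may not match exactly (for example, whole numbers are typically inferred as `long`), so the comparison may need some tolerance.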

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz
Community Manager

Hi @Sailaja B​ , Please go through this link for the related query.

4 REPLIES

weldermartins
Honored Contributor

Hello, for several levels you can use the functions explode(array(desired-level)).

from pyspark.sql.functions import array, explode

# A small example to illustrate the idea.
df = (
    testDF
    .withColumn("company", explode(array("company")))
    .withColumn("employees", explode(array("company.employees")))
)

SailajaB
Valued Contributor III

Hi @welder martins​ ,

Thank you for your reply.

We are looking for ways to validate the output DataFrame schema against the JSON schema config.

I expect the approach above will be useful if we need to flatten the nested JSON structure.
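
Editor's sketch of one possible direct comparison, not an approach confirmed in this thread: flatten both schemas into `{path: type}` maps and diff them. It assumes the Spark side is the dict returned by `df.schema.jsonValue()` and the config uses `type`, `properties`, and `items`. The helper names and the `_COMPATIBLE` type mapping are hypothetical and would need adjusting to the real config (for example, decimals, dates, and maps are not covered).

```python
def spark_paths(dtype, prefix=""):
    """Flatten the dict form of a Spark schema into {path: type name}."""
    if isinstance(dtype, str):              # primitive, e.g. "string", "long"
        return {prefix: dtype}
    if dtype["type"] == "struct":
        out = {}
        for field in dtype["fields"]:
            path = f"{prefix}.{field['name']}" if prefix else field["name"]
            out.update(spark_paths(field["type"], path))
        return out
    if dtype["type"] == "array":
        return spark_paths(dtype["elementType"], prefix + "[]")
    return {prefix: dtype["type"]}          # other complex types kept as leaves

def json_schema_paths(schema, prefix=""):
    """Flatten a JSON Schema dict into {path: JSON type name}."""
    t = schema.get("type")
    if t == "object":
        out = {}
        for name, sub in schema.get("properties", {}).items():
            path = f"{prefix}.{name}" if prefix else name
            out.update(json_schema_paths(sub, path))
        return out
    if t == "array":
        return json_schema_paths(schema.get("items", {}), prefix + "[]")
    return {prefix: t}

# Assumed mapping: JSON Schema type -> acceptable Spark type names.
_COMPATIBLE = {
    "string": {"string"},
    "integer": {"byte", "short", "integer", "long"},
    "number": {"float", "double", "integer", "long"},
    "boolean": {"boolean"},
}

def schema_mismatches(spark_schema, json_schema):
    """Return a list of human-readable differences (empty list means match)."""
    actual = spark_paths(spark_schema)
    expected = json_schema_paths(json_schema)
    problems = []
    for path, t in expected.items():
        if path not in actual:
            problems.append(f"missing: {path}")
        elif actual[path] not in _COMPATIBLE.get(t, set()):
            problems.append(
                f"type mismatch at {path}: expected {t}, got {actual[path]}")
    problems += [f"unexpected: {p}" for p in actual if p not in expected]
    return problems

# Hypothetical data: in practice spark_schema would be df.schema.jsonValue().
spark_schema = {"type": "struct", "fields": [
    {"name": "company", "nullable": True, "metadata": {}, "type": {
        "type": "struct", "fields": [
            {"name": "name", "type": "string", "nullable": True, "metadata": {}},
            {"name": "employees", "nullable": True, "metadata": {}, "type": {
                "type": "array", "containsNull": True, "elementType": {
                    "type": "struct", "fields": [
                        {"name": "id", "type": "string",
                         "nullable": True, "metadata": {}}]}}}]}}]}

json_schema = {"type": "object", "properties": {"company": {
    "type": "object", "properties": {
        "name": {"type": "string"},
        "employees": {"type": "array", "items": {
            "type": "object", "properties": {
                "id": {"type": "integer"},
                "email": {"type": "string"}}}}}}}}

problems = schema_mismatches(spark_schema, json_schema)
for line in problems:
    print(line)
```

Because every leaf is keyed by its full path, the nesting depth does not matter, and each reported difference points at the exact field that deviates from the target config.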

Anonymous
Not applicable

@Sailaja B​ - Hi! My name is Piper, and I'm a moderator for the community. Thanks for your question. Please let us know how things go. If @welder martins​' response answers your question, would you be happy to come back and mark their answer as best? That will help other members find the answer more quickly.
