Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Ways to validate final Dataframe schema against JSON schema config file

SailajaB
Valued Contributor III

Hi Team,

We need to validate the transformed DataFrame's output schema against a JSON schema config file.

Here is the scenario:

Our input JSON schema and target JSON schema are different. Using Databricks, we apply the required schema changes. Now we need to validate the final DataFrame schema against the target JSON schema config file.

Note: the JSON schema is very complex (there are up to 7 levels of nesting differences between input and output).

We tried a few Python libraries; they all work fine for simple schemas, but not for a complex one like ours.

We are looking for an approach such as converting the complex JSON schema config into sample JSON data (i.e., matching the target JSON schema config) so we can easily validate it against the final DataFrame schema.
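
For illustration, something along these lines is roughly what we want to end up with (a minimal sketch only; it assumes the target config were already in Spark's schema-JSON format, i.e. the output of StructType.json(), and the file path and the final_df name are made up):

# Minimal sketch. Assumption: the target config is stored in Spark's own
# schema-JSON format; the path and the final_df name are hypothetical.
import json
from pyspark.sql.types import StructType

with open("/dbfs/configs/target_schema.json") as f:
    target_schema = StructType.fromJson(json.load(f))

# Strict check: fails on any difference, including nullability and field order.
assert final_df.schema == target_schema, "Schema does not match target config"

If the config is not already in that format, that is where the conversion question comes in.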

Thank you

3 REPLIES

weldermartins
Honored Contributor

Hello, for several nesting levels you can use explode(array(<desired level>)).

# A small example for your understanding.
from pyspark.sql.functions import array, explode

df = testDF\
    .withColumn("company", explode(array("company")))\
    .withColumn("employees", explode(array("company.employees")))

SailajaB
Valued Contributor III

Hi @weldermartins,

Thank you for your reply.

We are looking for ways to validate the output dataframe schema against the JSON schema config.

The above will certainly be useful if we need to flatten the nested JSON structure.
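
To illustrate what we mean, something along these lines is what we are trying to achieve (a rough sketch only; it assumes the target config has already been loaded into a Spark StructType called target_schema, and final_df is the transformed DataFrame, both hypothetical names):

# Rough sketch: field-by-field comparison that ignores nullability and field
# order. Assumes target_schema (StructType from the config) and final_df exist.
from pyspark.sql.types import ArrayType, DataType, StructType

def schemas_match(actual: DataType, expected: DataType) -> bool:
    if isinstance(expected, StructType):
        if not isinstance(actual, StructType):
            return False
        actual_fields = {f.name: f.dataType for f in actual.fields}
        expected_fields = {f.name: f.dataType for f in expected.fields}
        if actual_fields.keys() != expected_fields.keys():
            return False
        return all(schemas_match(actual_fields[n], expected_fields[n])
                   for n in expected_fields)
    if isinstance(expected, ArrayType):
        return (isinstance(actual, ArrayType)
                and schemas_match(actual.elementType, expected.elementType))
    return actual == expected  # simple types: compare type and parameters

print(schemas_match(final_df.schema, target_schema))

The recursion handles the 7 levels of nesting, but it still needs the target config as a StructType, which is the part we are struggling with.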

Anonymous
Not applicable

@Sailaja B - Hi! My name is Piper, and I'm a moderator for the community. Thanks for your question. Please let us know how things go. If @weldermartins' response answers your question, would you be happy to come back and mark their answer as best? That will help other members find the answer more quickly.
