Schema Evolution in Azure Databricks

CBL — Wed, 28 Feb 2024 18:15:25 GMT

Hi All -

In my scenario, I am loading data from hundreds of JSON files.

The problem is that fields/columns go missing when a JSON file contains new fields.

Full Load:

While writing the JSON data to Delta, I use .option("mergeSchema", "true") so that we do not miss new columns.

Incremental Load:

The problem is here, as the schema of the incoming files does not match the existing schema.

Could you please assist with schema comparison during the incremental load?

The schema of the new JSON files should be compared with the schema of the existing JSON files.
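
To make the comparison being asked about concrete, here is a minimal sketch in plain Python. It assumes each schema has already been flattened to a {column_name: type_string} mapping, which in PySpark can be built with dict(df.dtypes). The function name diff_schemas and the sample schemas are hypothetical and only illustrate the idea; nested struct fields are not expanded here.

```python
from typing import Dict


def diff_schemas(existing: Dict[str, str], incoming: Dict[str, str]) -> Dict[str, list]:
    """Compare two flat {column_name: type_string} mappings.

    Returns the columns that are new in `incoming`, the columns that are
    absent from `incoming`, and the columns whose type string differs.
    """
    shared = set(existing) & set(incoming)
    return {
        "added": sorted(set(incoming) - set(existing)),
        "missing": sorted(set(existing) - set(incoming)),
        "changed": sorted(
            (name, existing[name], incoming[name])
            for name in shared
            if existing[name] != incoming[name]
        ),
    }


# Hypothetical schemas, standing in for dict(existing_df.dtypes)
# and dict(new_df.dtypes).
existing_schema = {"id": "bigint", "name": "string", "amount": "double"}
incoming_schema = {"id": "string", "name": "string", "city": "string"}

report = diff_schemas(existing_schema, incoming_schema)
print(report)
```

The "added" list shows which columns a schema-evolving write would introduce, while the "changed" list flags type conflicts that would need to be handled (for example by casting) before the write.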

Re: Schema Evolution in Azure Databricks

cgrant — Mon, 13 Jan 2025 22:03:39 GMT

For these scenarios, you can use schema evolution capabilities such as mergeSchema, or opt for the new VariantType to avoid requiring a schema at ingest time.
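
As a rough sketch of the mergeSchema route for the incremental load, the helper below appends a DataFrame to a Delta table with schema evolution enabled on the write. The function name append_with_schema_evolution and the paths in the usage comment are hypothetical; df is assumed to be a PySpark DataFrame read from the new JSON files.

```python
def append_with_schema_evolution(df, target_path: str) -> None:
    """Append `df` to the Delta table at `target_path`.

    With mergeSchema enabled, columns present in `df` but not yet in the
    table are added to the table schema as part of the write.
    """
    (
        df.write.format("delta")
        .mode("append")
        .option("mergeSchema", "true")
        .save(target_path)
    )


# Usage on a cluster (paths are placeholders, not from the thread):
# new_df = spark.read.json("/path/to/new/json/files")
# append_with_schema_evolution(new_df, "/path/to/delta/table")
```

Note that mergeSchema adds new columns but does not resolve incompatible type changes on existing columns, so those still need to be reconciled before the append.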