Databricks

zyang · 07-21-2022

I am using schema evolution on a Delta table, and the code below runs in a Databricks notebook.

    # Append to the Delta table at `path`, letting Delta merge in new columns
    (
        df.write
        .format("delta")
        .mode("append")
        .option("mergeSchema", "true")
        .partitionBy("date")
        .save(path)
    )

But I still got the error below. Is it valid to define the schema explicitly and enable mergeSchema at the same time?

AnalysisException: The specified schema does not match the existing schema at path.
== Specified ==
 
    root
    -- A: string (nullable = false)
    -- B: string (nullable = true)
    -- C: long (nullable = true)
    
== Existing ==
    root
    -- A: string (nullable = true)
    -- B: string (nullable = true)
    -- C: long (nullable = true)
 
== Differences==
- Field A is non-nullable in specified schema but nullable in existing schema.
 
If your intention is to keep the existing schema, you can omit the
schema from the create table command. Otherwise please ensure that
the schema matches.

Noopur_Nigam · 08-30-2022

Hi @z yang, please share the code that creates df as well, so that we can understand the full exception and scenario.
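
While the df creation code is pending, the Differences section of the error points at one thing that can be checked directly: field A is non-nullable in the specified schema but nullable in the existing one. Below is a minimal sketch of one possible way to rule that out by marking every top-level field of the DataFrame as nullable before writing. The helper names `relax_nullability` and `with_nullable_fields` are hypothetical, the sketch assumes PySpark on a cluster where `df.rdd` is available, and it may not resolve the error if the exception is actually raised by a separate create-table statement that specifies a schema.

```python
import copy


def relax_nullability(schema_json):
    """Return a copy of a schema dict (the shape produced by
    df.schema.jsonValue()) with every top-level field marked nullable."""
    relaxed = copy.deepcopy(schema_json)
    for field in relaxed["fields"]:
        field["nullable"] = True
    return relaxed


def with_nullable_fields(spark, df):
    """Rebuild df with the same columns and types, all top-level fields nullable.

    Hypothetical helper: `spark` is the active SparkSession (predefined in a
    Databricks notebook) and `df` is the DataFrame about to be written.
    """
    # Imported here so the pure-Python helper above works without PySpark.
    from pyspark.sql.types import StructType

    schema = StructType.fromJson(relax_nullability(df.schema.jsonValue()))
    return spark.createDataFrame(df.rdd, schema)


# Usage in the notebook (assumption: `spark`, `df`, and `path` already exist):
# aligned = with_nullable_fields(spark, df)
# aligned.printSchema()  # field A should now be reported as nullable
# aligned.write.format("delta").mode("append") \
#     .option("mergeSchema", "true").partitionBy("date").save(path)
```

Comparing `df.printSchema()` before and after the call shows whether nullability was the only difference from the existing table schema. Nested struct fields are left untouched by this sketch.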

pyspark delta table schema evolution