Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Declarative Pipelines: set Merge Schema to False

a_user12
New Contributor III

Dear Team!

I want to prevent the schema of a certain table from being updated automatically. With plain Structured Streaming I can do the following:

# Append to the Delta table without letting new columns evolve its schema
silver_df.writeStream \
    .format("delta") \
    .option("mergeSchema", "false") \
    .option("checkpointLocation", checkpoint_path) \
    .outputMode("append") \
    .toTable("silver_table")

How can I set mergeSchema=false with Declarative Pipelines?

2 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @a_user12 ,

Did you try something like what is shown in the link below? Of course, in your case you would set it to "false":

Learn Data Engineering: Databricks Delta Live Table | by THE BRICK LEARNING | Medium


 

a_user12
New Contributor III

@szymon_dybczak  - thank you for your response 

I tried:

 

 
import dlt
from pyspark.sql.functions import col, from_json, lit

@dlt.table(
    name="deserialized",
    comment="Raw messages from Kafka topic as JSON",
    table_properties={
        "pipelines.autoOptimize.managed": "true",
        "pipelines.autoCompact.managed": "true"
    }
)
def deserialize():
    # Read the stringified Kafka messages from the upstream table
    return spark.readStream \
            .table("stringified") \
            .withColumn("payload", from_json(col("payload"),None,{"schemaLocationKey": "x"})) \
            .select("topic","timestamp","payload") \
            .withColumn("new-x",lit("foo"))  # new column added to test schema enforcement downstream
    



@dlt.table(
    name="enriched_table",
    table_properties={
        "pipelines.autoOptimize.managed": "true",
        "pipelines.autoCompact.managed": "true"
    }
)
def enriched_table():
    return spark.readStream.option("mergeSchema","false").table("deserialized")
        #.withColumn("new",lit("new"))  # ensure columns match exactly    

I would expect an error if the column "new-x" does not yet exist in the table "enriched_table". Instead, the pipeline simply adds the new column to "enriched_table".
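
One possible workaround, not confirmed anywhere in this thread, is to check the incoming columns yourself and fail the update when they drift. The sketch below is plain Python: `check_schema_frozen` and `EXPECTED_COLUMNS` are hypothetical names I made up, and the commented-out pipeline usage assumes that `df.columns` is already resolved when the table function runs, which I have not verified for Declarative Pipelines.

```python
# Hypothetical frozen column list for "enriched_table" (an assumption for illustration).
EXPECTED_COLUMNS = ["topic", "timestamp", "payload"]


def check_schema_frozen(actual_columns, expected_columns):
    """Raise ValueError if the incoming column names differ from the expected set.

    Only column names are compared; data types are not checked.
    Returns the expected columns in their declared order when they match.
    """
    actual, expected = set(actual_columns), set(expected_columns)
    unexpected = sorted(actual - expected)  # columns that would be added
    missing = sorted(expected - actual)     # columns that disappeared upstream
    if unexpected or missing:
        raise ValueError(
            f"Schema drift detected: unexpected={unexpected}, missing={missing}"
        )
    return list(expected_columns)


# Possible use inside the pipeline (not executed here, untested assumption):
#
# @dlt.table(name="enriched_table")
# def enriched_table():
#     df = spark.readStream.table("deserialized")
#     cols = check_schema_frozen(df.columns, EXPECTED_COLUMNS)
#     return df.select(*cols)
```

With the "new-x" column present upstream, the helper raises instead of letting the column through. If you would rather drop unknown columns silently than fail, selecting only `EXPECTED_COLUMNS` without the check is the simpler variant.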