I have created a materialized view using a Delta Live Tables pipeline and it's not appending data

zero234
New Contributor III
I have created a materialized view using a Delta Live Tables pipeline, and for some reason it is overwriting the data every day. I want it to append data to the table instead of doing a full refresh: suppose I had 8 million records in the table; if I run the pipeline, it removes those previous records and only puts in the new records. I want it to append to the already existing data.

I have tried using @dlt.table(mergeMode="append"), but it throws an "unexpected keyword argument" error.

What should I do so that my pipeline appends data?

 

1 ACCEPTED SOLUTION

Accepted Solutions

Kaniz
Community Manager

Hi @zero234, to ensure that your Delta Live Tables pipeline appends data instead of overwriting it, you can use the @dlt.append_flow decorator.

Here are the steps:

  1. Use @append_flow: the @dlt.append_flow decorator writes to an existing streaming table, so each pipeline update appends new records rather than triggering a full refresh.

  2. Define Materialized Views or Streaming Tables: a materialized view is fully recomputed on every update, while a streaming table processes only the new input data incrementally. To append, the destination must be a streaming table.

  3. Example Usage:

    import dlt

    # Materialized view: recomputed in full (overwritten) on every pipeline update
    @dlt.table
    def my_materialized_view():
        return spark.read.table("my_source_data")

    # append_flow needs an existing streaming table to write into
    dlt.create_streaming_table("my_streaming_table")

    # Each update appends the new source records instead of doing a full refresh
    @dlt.append_flow(target="my_streaming_table")
    def my_streaming_flow():
        return spark.readStream.table("my_source_data")
    
  4. Override Default Behavior: by default, @dlt.table on a static (batch) read creates a materialized view that is rewritten on every run. Reading the source with spark.readStream instead makes the destination a streaming table, which appends, as shown in the sketch below.
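Applied to the original problem, a minimal sketch of that override (assuming a hypothetical source table named my_source_data):

    import dlt

    # The original definition used a static read, which DLT treats as a
    # materialized view and fully recomputes -- hence the daily "overwrite".
    # A streaming read makes the destination a streaming table, so each
    # pipeline update appends only the newly arrived records.
    @dlt.table
    def my_table():
        return spark.readStream.table("my_source_data")  # hypothetical source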

 


2 REPLIES


kulkpd
Contributor

@zero234,

Adding a suggestion based on the answer from @Kaniz. An important point to note here: "To define a materialized view in Python, apply @table to a query that performs a static read against a data source. To define a streaming table, apply @table to a query that performs a streaming read against a data source."

I think if you switch the read to streaming mode, DLT will treat your destination as a streaming table.
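A minimal sketch of that distinction (assuming a hypothetical source table named my_source_data): the same @dlt.table decorator yields a materialized view or a streaming table depending on how the query reads the source.

    import dlt

    # Static read -> DLT creates a materialized view (fully recomputed each update)
    @dlt.table
    def my_materialized_view():
        return spark.read.table("my_source_data")

    # Streaming read -> DLT creates a streaming table (new records are appended)
    @dlt.table
    def my_streaming_table():
        return spark.readStream.table("my_source_data")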

 
