Data Engineering

I have created a materialized view using a Delta Live Tables pipeline and it's not appending data

zero234
New Contributor III
I have created a materialized view using a Delta Live Tables pipeline, and for some reason it overwrites the data every day. I want it to append data to the table instead of doing a full refresh. For example, if I had 8 million records in the table and I run the pipeline, it removes those previous records and only inserts the new ones. I want it to append to the already existing data.

I have tried using @dlt.table(mergeMode="append"), but it throws an "unexpected keyword argument" error.

What should I do so my pipeline appends data?

 

1 ACCEPTED SOLUTION


Kaniz_Fatma
Community Manager

Hi @zero234, to ensure that your Delta Live Tables pipeline appends data instead of overwriting it, you can use the @dlt.append_flow decorator.

Here are the steps:

  1. Use @dlt.append_flow: the @dlt.append_flow decorator writes the output of a streaming query to a target streaming table in append mode, so each pipeline update adds new rows instead of replacing the table.

  2. Define Materialized Views or Streaming Tables: a materialized view is defined by applying @dlt.table to a static (batch) read and is recomputed on each update, while a streaming table is defined over a streaming read and appends each new input row exactly once.

  3. Example Usage:

    import dlt

    # Materialized view: @dlt.table over a static (batch) read;
    # it is recomputed from the source on every pipeline update.
    @dlt.table
    def my_materialized_view():
        return spark.read.table("my_source_data")

    # Streaming table that append flows write into.
    dlt.create_streaming_table("my_streaming_table")

    # Append flow: the streaming read below is appended to the target table.
    @dlt.append_flow(target="my_streaming_table")
    def my_streaming_pipeline():
        return spark.readStream.table("my_source_data")
    
  4. Override Default Behavior: by default, a materialized view is fully recomputed on each pipeline update. To get append semantics instead, write to a streaming table, either by defining the table over a streaming read or by targeting it with one or more append flows, as in the sketch after this list.
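
One common use of this pattern, and a reason @dlt.append_flow exists, is feeding a single streaming table from several sources without a full refresh. A minimal sketch, assuming two hypothetical source tables (all names here are illustrative, not from the thread):

    import dlt

    # One target streaming table fed by multiple append flows; each flow's
    # rows are appended without refreshing the rest of the table.
    dlt.create_streaming_table("all_events")

    @dlt.append_flow(target="all_events")
    def events_from_region_a():
        return spark.readStream.table("events_region_a")

    @dlt.append_flow(target="all_events")
    def events_from_region_b():
        return spark.readStream.table("events_region_b")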

 


3 REPLIES


Kasen
New Contributor II

Hi @Kaniz_Fatma,

In my DLT pipeline, I'm using DLT Classic Core as the resource. When I run the DLT pipeline (creating the silver layer from bronze) for the first time, it creates a materialized view in the silver layer. When some rows in the bronze layer are updated and I re-run the DLT pipeline, I see that the data in the silver layer does reflect the latest changes from the bronze layer. What I'm not clear about is whether the materialized view in the silver layer is doing a full refresh or just updating the rows that have changed. I couldn't find any source on this topic, especially since I'm using DLT Classic Core in the DLT pipeline without CDC. I'd appreciate your clarification, thank you!

kulkpd
Contributor

@zero234 ,

Adding a suggestion based on the answer from @Kaniz_Fatma. An important point to note here: "To define a materialized view in Python, apply @table to a query that performs a static read against a data source. To define a streaming table, apply @table to a query that performs a streaming read against a data source."

I think if you wish to read in streaming mode, DLT will treat your destination as a streaming table.
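
To illustrate that point, in the minimal sketch below the only difference between the two definitions is the read mode; the table and source names are hypothetical:

    import dlt

    # Static (batch) read: DLT creates a materialized view,
    # recomputed from the source on every pipeline update.
    @dlt.table(name="orders_mv")
    def orders_mv():
        return spark.read.table("orders_raw")

    # Streaming read: DLT creates a streaming table, and newly
    # arrived source rows are appended on each update.
    @dlt.table(name="orders_st")
    def orders_st():
        return spark.readStream.table("orders_raw")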

 
