Data Engineering

Error: If you expect to delete or update rows to the source table in the future...

rt-slowth
Contributor

Flow 'user_silver' has FAILED fatally. An error occurred because we detected an update or delete to one or more rows in the source table. Streaming tables may only use append-only streaming sources. If you expect to delete or update rows to the source table in the future, please convert table user_silver to a live table instead of a streaming live table. To resolve this issue, perform a Full Refresh to table user_silver. A Full Refresh will attempt to clear all data from table user_silver and then load all data from the streaming source.

The non-append change can be found at version 11.
Operation: MERGE
Username: doheekim

When I create a silver table in Databricks that uses the table to which dlt.apply_changes is applied as its source data, why does the first run work fine, but from the second run onward an error is thrown saying that only append is possible?
Source table name: user
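
For reference, here is a minimal sketch of the pattern I mean (the bronze source name, key, and sequence column below are placeholders; only user and user_silver are the real dataset names):

import dlt
from pyspark.sql.functions import col

# Target of apply_changes: after the initial load, changes land as MERGEs
dlt.create_streaming_table("user")
dlt.apply_changes(
    target="user",
    source="user_bronze",           # placeholder name for the upstream CDC feed
    keys=["user_id"],               # placeholder primary key
    sequence_by=col("updated_at"),  # placeholder ordering column
)

# Streaming table reading the MERGE target: the first run succeeds because
# the initial load is append-only, but later runs fail once MERGEs appear
@dlt.table(name="user_silver")
def user_silver():
    return dlt.read_stream("user")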


@Retired_mod 
Can you give me some example code on how to switch to live tables using PySpark?

Palash01
Valued Contributor

Hey @rt-slowth 

Adding to @Retired_mod's points, and referring to the bronze and silver code you posted in the questions-about-the-design-of-bronze-silver-and-gold-for-live post: it looks like you are doing an SCD operation on your bronze layer, which explains why the pipeline errors out on the subsequent runs. To overcome this, you can try the approach Kaniz explained in her post.

Also, to switch to live tables, you can use the readStream and writeStream functions in your code.

For writing the streaming table: 

# 'transformed_stream' is assumed to be a streaming DataFrame built upstream
query = transformed_stream \
    .writeStream \
    .format("delta") \
    .option("checkpointLocation", "/databricks/dbfs/checkpoints") \
    .trigger(processingTime="5 minutes") \
    .start("/databricks/dbfs/live_tables/my_live_table")

For reading a streaming table:

# Note: processingTime is a trigger option for writeStream, not a valid
# readStream option, so it belongs on the write side instead
source_stream = spark.readStream \
    .format("delta") \
    .load("/databricks/dbfs/mnt/landing_zone")
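
Since the error message itself suggests converting user_silver to a live table, here is a minimal sketch of what that could look like inside a Delta Live Tables pipeline. This is an assumption based on the error text rather than code from your pipeline; only the dataset names user and user_silver come from the post:

import dlt

# Define user_silver as a live table (materialized view) instead of a
# streaming table, so MERGEs from apply_changes on the source are allowed
@dlt.table(name="user_silver")
def user_silver():
    # dlt.read() is a batch read, recomputed on each pipeline update,
    # unlike dlt.read_stream(), which requires an append-only source
    return dlt.read("user")

The trade-off is that a live table is typically recomputed in full on each pipeline update rather than processing only increments, which is exactly what makes non-append changes in the source acceptable.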


Leave a like if this helps! Kudos,
Palash

Palash01
Valued Contributor

Hey @rt-slowth 

Just checking in to see if the provided solution was helpful to you. If so, please accept it as the Best Solution so this thread can be considered closed.

Leave a like if this helps! Kudos,
Palash
