Simple append for a DLT
02-25-2025 02:34 PM
Looking for some help getting unstuck re: appending to DLTs in Databricks. I have successfully extracted data via an API endpoint, done some initial data cleaning/processing, and stored that data in a DLT. Great start. But I noticed that each time the pipeline runs, all of the previous rows are overwritten. The AI assistant and separate Google searches have proven worthless thus far in helping me understand why I cannot simply append data from each run to the DLT. I manually added a timestamp column to ensure that each run's data is unique, and each time the pipeline runs I can verify that the data is fresh. But I only see the new data; the old rows are overwritten. According to my research, append is supposedly the default behavior when writing to a DLT, but that's not happening and I don't understand why. Attempts to explicitly define the append properties for the DLT (both in the notebook and in the pipeline settings) have not helped. Here is a simple example of what I'm trying (and failing) to do: