03-31-2026 12:40 PM
Hi everyone,
I’m dealing with a scenario combining Delta Live Tables, CDC ingestion, and streaming pipelines, and I’ve hit a challenge that I haven’t seen clearly addressed in the docs.
Some context:
- Source is an upstream system emitting CDC events (insert/update/delete)
- Data is ingested via Auto Loader into a bronze layer
- From there, I’m using DLT to build silver tables with merge logic (SCD Type 1)
- The pipeline runs in continuous/streaming mode
The issue is around schema evolution, especially breaking changes:
- column type changes (e.g., int → string)
- column drops or renames
- nested structure changes
While Auto Loader can handle schema evolution to some extent, downstream DLT transformations (especially merges) tend to fail or behave unpredictably when these changes occur.
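To make the change categories above concrete, here is a rough sketch of how a schema diff could be classified before it reaches silver. It is plain Python, not a DLT or Auto Loader API: schemas are assumed to be simple dicts of column name to type string (nested structs as nested dicts), and `classify_schema_changes`, `is_breaking`, and the column names are all hypothetical. A rename cannot be told apart from a drop plus an add at this level, so it shows up as both.

```python
# Hypothetical helper, not part of DLT: schemas are plain dicts mapping
# column name -> type string, with nested structs as nested dicts.

def _describe(col_type):
    """Render a column type for messages; nested structs become 'struct'."""
    return "struct" if isinstance(col_type, dict) else col_type


def classify_schema_changes(old, new, path=""):
    """Return (kind, column_path, detail) tuples describing how `new` differs from `old`."""
    changes = []
    for name, old_type in old.items():
        full = f"{path}{name}"
        if name not in new:
            changes.append(("dropped", full, _describe(old_type)))
            continue
        new_type = new[name]
        if isinstance(old_type, dict) and isinstance(new_type, dict):
            # Both sides are structs: recurse to find nested changes.
            changes.extend(classify_schema_changes(old_type, new_type, full + "."))
        elif old_type != new_type:
            detail = f"{_describe(old_type)} -> {_describe(new_type)}"
            changes.append(("type_changed", full, detail))
    for name, new_type in new.items():
        if name not in old:
            changes.append(("added", f"{path}{name}", _describe(new_type)))
    return changes


# Assumption: only drops and type changes are treated as breaking;
# purely additive columns are not.
BREAKING_KINDS = {"dropped", "type_changed"}


def is_breaking(changes):
    """True if any change is of a kind treated as breaking."""
    return any(kind in BREAKING_KINDS for kind, _, _ in changes)


if __name__ == "__main__":
    old = {"id": "int", "amount": "int",
           "address": {"city": "string", "zip": "int"}}
    new = {"id": "int", "amount": "string",
           "address": {"city": "string", "postcode": "string"}}
    for change in classify_schema_changes(old, new):
        print(change)
    print("breaking:", is_breaking(classify_schema_changes(old, new)))
```

In this example the int → string change on `amount` and the `zip` → `postcode` rename inside `address` are both flagged as breaking, which are exactly the cases where my merges start misbehaving.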
My concerns:
- avoiding pipeline failures in production
- maintaining data quality and historical consistency
- not overcomplicating the pipeline with excessive manual handling
Questions:
- What’s the best pattern to handle breaking schema changes in this setup?
- Do you isolate schema evolution strictly in bronze and enforce contracts from silver onward?
- Has anyone implemented schema versioning or schema registry-like patterns with DLT?
- How do you balance flexibility (auto evolution) vs governance (strict schemas)?
Would really appreciate insights from anyone who has dealt with this in production.
Thanks!