4 weeks ago
Hi,
A column was deleted on the source table. When I ran LSDP, it failed with the error DELTA_STREAMING_INCOMPATIBLE_SCHEMA_CHANGE_USE_LOG: Streaming read is not supported on tables with read-incompatible schema changes (e.g., rename, drop, or data type changes).
What is the recommended approach to handle column deletions or data type changes on a source table in Lakeflow SDP?
3 weeks ago
Hi zahid6793,
Thanks for engaging with the thread. To address your question about the void datatype persisting in the Unity Catalog UI after a full refresh: this can happen when the target table metadata in Unity Catalog has not been fully rebuilt. A full refresh of the streaming table in the pipeline clears the checkpoint and reprocesses data, but if the previous table definition still carries stale column metadata, you may see void as a residual artifact.
A few things to verify after running a full refresh:
- Run DESCRIBE TABLE EXTENDED <catalog.schema.table> to confirm the current schema matches expectations. If the dropped column still appears as void, the metadata has not fully updated.
- A full refresh of the table (with the table property pipelines.reset.allowed = true) will force a clean metadata rebuild.
The key distinction is that the streaming engine reads the Delta log sequentially from the checkpoint offset. Even if the current schema looks correct, the stream will fail if the log still contains the drop/rename transaction in the replay range. A full refresh resets the checkpoint so the stream starts from the latest state, which should bypass the problematic log entry.
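To make the checkpoint-offset point concrete, here is a small conceptual sketch in plain Python. It is not the Delta engine or any real API, just an illustration of why an incremental read fails while a full refresh succeeds: the reader replays log entries from its checkpoint offset, and a full refresh moves that offset past the incompatible transaction.

```python
# Illustrative model only: entry names and the replay() helper are made up.
INCOMPATIBLE = {"dropColumn", "renameColumn", "changeDataType"}

def replay(log, start_offset):
    """Replay log entries from the checkpoint offset; raise on a
    read-incompatible schema change, mirroring the behavior behind
    DELTA_STREAMING_INCOMPATIBLE_SCHEMA_CHANGE_USE_LOG."""
    for version, action in log[start_offset:]:
        if action in INCOMPATIBLE:
            raise RuntimeError(f"incompatible schema change at version {version}")
    return "ok"

log = [(0, "addFile"), (1, "dropColumn"), (2, "addFile")]

# Incremental read from the old checkpoint hits the drop transaction:
try:
    replay(log, start_offset=0)
except RuntimeError as e:
    print(e)  # incompatible schema change at version 1

# Full refresh: the checkpoint is reset past the drop, so replay succeeds:
print(replay(log, start_offset=2))  # ok
```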
3 weeks ago
Hi @SteveOstrowski,
Thanks for responding 🙂
Could you please let me know how to set pipelines.reset.allowed = true?
2 weeks ago
Hi IM_01,
You can set pipelines.reset.allowed as a table property directly in your pipeline definition. The approach depends on whether you are using Python or SQL:
Python:
import dlt

@dlt.table(
    table_properties={"pipelines.reset.allowed": "true"}
)
def my_streaming_table():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/path/to/data")
    )
SQL:
CREATE OR REFRESH STREAMING TABLE my_streaming_table
TBLPROPERTIES ("pipelines.reset.allowed" = "true")
AS SELECT * FROM STREAM(LIVE.source_table);
Setting this to "true" allows a full refresh of that specific table, which is what you need to resolve the DELTA_STREAMING_INCOMPATIBLE_SCHEMA_CHANGE_USE_LOG error. Once you have set the property and run a full refresh to pick up the schema change, you may want to set it back to "false" to prevent accidental full refreshes in production.
More detail on pipeline table properties is available in the docs: Pipeline table properties.
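If you want to flip the property back without redeploying the pipeline definition, one option is an ALTER TABLE ... SET TBLPROPERTIES statement run from a notebook. A minimal sketch (the helper name is my own; the returned statement would be passed to spark.sql):

```python
def set_reset_allowed(table: str, allowed: bool) -> str:
    # Build the ALTER TABLE statement; execute it with spark.sql(...)
    value = "true" if allowed else "false"
    return (f"ALTER TABLE {table} "
            f"SET TBLPROPERTIES ('pipelines.reset.allowed' = '{value}')")

# e.g. spark.sql(set_reset_allowed("my_streaming_table", False))
print(set_reset_allowed("my_streaming_table", False))
```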
2 weeks ago
Thanks @SteveOstrowski, it's working now.