Hi zahid6793,
Thanks for engaging with the thread. To address your question about the void datatype persisting in the Unity Catalog UI after a full refresh: this can happen when the target table metadata in Unity Catalog has not been fully rebuilt. A full refresh of the streaming table in the pipeline clears the checkpoint and reprocesses data, but if the previous table definition still carries stale column metadata, you may see void as a residual artifact.
A few things to verify after running a full refresh:
- Run DESCRIBE TABLE EXTENDED <catalog.schema.table> to confirm the current schema matches expectations. If the dropped column still appears as void, the metadata has not fully updated.
- If the stale column persists, dropping and recreating the target table (or letting the pipeline recreate it via full refresh with pipelines.reset.allowed = true) will force a clean metadata rebuild.
- Check that the source table's Delta log no longer contains the non-additive schema change in the range the stream needs to replay. The error is a read-side failure triggered by encountering that transaction in the log history, not by the current schema state.
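As a plain-Python illustration of the DESCRIBE TABLE check above (this is not a Databricks API; `find_void_columns` is a hypothetical helper you would feed the collected rows into):

```python
def find_void_columns(describe_rows):
    """Scan (col_name, data_type, comment) tuples, as returned by
    DESCRIBE TABLE, and report columns whose type is still 'void'.
    Stops at the blank separator row that precedes the detailed
    metadata section of DESCRIBE TABLE EXTENDED output."""
    stale = []
    for col_name, data_type, _comment in describe_rows:
        if col_name == "":  # blank row ends the column listing
            break
        if data_type.lower() == "void":
            stale.append(col_name)
    return stale

# Rows as they might come back from
# spark.sql("DESCRIBE TABLE EXTENDED <catalog.schema.table>").collect()
rows = [
    ("id", "bigint", None),
    ("dropped_col", "void", None),
    ("", "", ""),
    ("# Detailed Table Information", "", ""),
]
print(find_void_columns(rows))  # ['dropped_col']
```

An empty result after the full refresh means the stale column metadata is gone.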
The key distinction is that the streaming engine reads the Delta log sequentially from the checkpoint offset. Even if the current schema looks correct, if the log still contains the drop/rename transaction in the replay range, the stream will fail. A full refresh resets the checkpoint so it starts from the latest state, which should bypass the problematic log entry.
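To make that distinction concrete, here is a toy model (plain Python, no Delta APIs) of a stream replaying log entries from a checkpoint offset. The replay fails only if the non-additive transaction falls inside the replay range, regardless of what the final schema looks like:

```python
def replay_log(log, checkpoint_offset):
    """Toy model of a streaming read over a Delta-style transaction log.
    Raises if a non-additive schema change lies inside the replay range."""
    for version, entry in enumerate(log):
        if version < checkpoint_offset:
            continue  # already committed to the checkpoint; skipped
        if entry == "DROP COLUMN":  # non-additive schema change
            raise RuntimeError(
                f"incompatible schema change at log version {version}"
            )
    return "ok"

log = ["ADD DATA", "DROP COLUMN", "ADD DATA", "ADD DATA"]

# Old checkpoint: the replay range still contains the drop -> stream fails.
try:
    replay_log(log, checkpoint_offset=0)
except RuntimeError as e:
    print(e)  # incompatible schema change at log version 1

# After a full refresh the checkpoint starts past the drop -> succeeds.
print(replay_log(log, checkpoint_offset=2))  # ok
```

This is why a full refresh resolves the error even though the table's current schema was already correct before the refresh.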
Hi @SteveOstrowski,
Thanks for responding 🙂
Could you please let me know how to set pipelines.reset.allowed = true?
Hi IM_01,
You can set pipelines.reset.allowed as a table property directly in your pipeline definition. The approach depends on whether you are using Python or SQL:
Python:
import dlt

@dlt.table(
    table_properties={"pipelines.reset.allowed": "true"}
)
def my_streaming_table():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/path/to/data")
    )
SQL:
CREATE OR REFRESH STREAMING TABLE my_streaming_table
TBLPROPERTIES ("pipelines.reset.allowed" = "true")
AS SELECT * FROM STREAM(LIVE.source_table);
Setting this to "true" allows a full refresh of that specific table, which is what you need to resolve the DELTA_STREAMING_INCOMPATIBLE_ error. Once you set the property and do a full refresh to pick up the schema change, you may want to set it back to "false" afterward to prevent accidental full refreshes in production.
More detail on pipeline table properties is available in the docs: Pipeline table properties.
Thanks @SteveOstrowski, now it's working.