Hi zahid6793,
Thanks for engaging with the thread. To address your question about the void datatype persisting in the Unity Catalog UI after a full refresh: this can happen when the target table metadata in Unity Catalog has not been fully rebuilt. A full refresh of the streaming table in the pipeline clears the checkpoint and reprocesses data, but if the previous table definition still carries stale column metadata, you may see void as a residual artifact.
A few things to verify after running a full refresh:
- Run DESCRIBE TABLE EXTENDED <catalog.schema.table> to confirm the current schema matches expectations. If the dropped column still appears as void, the metadata has not fully updated.
- If the stale column persists, dropping and recreating the target table (or letting the pipeline recreate it via full refresh with pipelines.reset.allowed = true) will force a clean metadata rebuild.
- Check that the source table's Delta log no longer contains the non-additive schema change in the range the stream needs to replay. The error is a read-side failure triggered by encountering that transaction in the log history, not by the current schema state.
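As a plain-Python illustration of the DESCRIBE TABLE check above (this is not a Databricks API; `find_void_columns` is a hypothetical helper you would feed the collected rows into):

```python
def find_void_columns(describe_rows):
    """Scan (col_name, data_type, comment) tuples, as returned by
    DESCRIBE TABLE, and report columns whose type is still 'void'.
    Stops at the blank separator row that precedes the detailed
    metadata section of DESCRIBE TABLE EXTENDED output."""
    stale = []
    for col_name, data_type, _comment in describe_rows:
        if col_name == "":  # blank row ends the column listing
            break
        if data_type.lower() == "void":
            stale.append(col_name)
    return stale

# Rows as they might come back from
# spark.sql("DESCRIBE TABLE EXTENDED <catalog.schema.table>").collect()
rows = [
    ("id", "bigint", None),
    ("dropped_col", "void", None),
    ("", "", ""),
    ("# Detailed Table Information", "", ""),
]
print(find_void_columns(rows))  # ['dropped_col']
```

An empty result after the full refresh means the stale column metadata is gone.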
The key distinction is that the streaming engine reads the Delta log sequentially from the checkpoint offset. Even if the current schema looks correct, if the log still contains the drop/rename transaction in the replay range, the stream will fail. A full refresh resets the checkpoint so it starts from the latest state, which should bypass the problematic log entry.
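To make that distinction concrete, here is a toy model (plain Python, no Delta APIs) of a stream replaying log entries from a checkpoint offset. The replay fails only if the non-additive transaction falls inside the replay range, regardless of what the final schema looks like:

```python
def replay_log(log, checkpoint_offset):
    """Toy model of a streaming read over a Delta-style transaction log.
    Raises if a non-additive schema change lies inside the replay range."""
    for version, entry in enumerate(log):
        if version < checkpoint_offset:
            continue  # already committed to the checkpoint; skipped
        if entry == "DROP COLUMN":  # non-additive schema change
            raise RuntimeError(
                f"incompatible schema change at log version {version}"
            )
    return "ok"

log = ["ADD DATA", "DROP COLUMN", "ADD DATA", "ADD DATA"]

# Old checkpoint: the replay range still contains the drop -> stream fails.
try:
    replay_log(log, checkpoint_offset=0)
except RuntimeError as e:
    print(e)  # incompatible schema change at log version 1

# After a full refresh the checkpoint starts past the drop -> succeeds.
print(replay_log(log, checkpoint_offset=2))  # ok
```

This is why a full refresh resolves the error even though the table's current schema was already correct before the refresh.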
Hi @SteveOstrowski,
Thanks for responding 🙂
Could you please let me know how to set pipelines.reset.allowed = true?
Hi IM_01,
You can set pipelines.reset.allowed as a table property directly in your pipeline definition. The approach depends on whether you are using Python or SQL:
Python:
import dlt

@dlt.table(
    table_properties={"pipelines.reset.allowed": "true"}
)
def my_streaming_table():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/path/to/data")
    )
SQL:
CREATE OR REFRESH STREAMING TABLE my_streaming_table
TBLPROPERTIES ("pipelines.reset.allowed" = "true")
AS SELECT * FROM STREAM(LIVE.source_table);
Setting this to "true" allows a full refresh of that specific table, which is what you need to resolve the DELTA_STREAMING_INCOMPATIBLE_ error. Once you set the property and do a full refresh to pick up the schema change, you may want to set it back to "false" afterward to prevent accidental full refreshes in production.
More detail on pipeline table properties is available in the docs: Pipeline table properties.
Thanks @SteveOstrowski, now it's working.