Hi Community,
Hope someone can help with this DLT question.
I am currently working in a Databricks environment using Delta Live Tables (DLT) with Unity Catalog enabled, and I'm encountering a blocker related to schema evolution and checkpoint metadata.
I am developing a streaming DLT pipeline in Python to build out dimension and fact tables iteratively. During this process, I frequently drop and re-create tables (e.g. dev.dds.dim_office) to refine their schema and logic. However, I get the following error when I re-run the pipeline after modifying the schema:
com.databricks.pipelines.common.errors.DLTAnalysisException: Table 'dev.dds.dim_office' has a user-specified schema that is incompatible with the schema inferred from its query.
Streaming tables are stateful and remember data that has already been processed. If you want to recompute the table from scratch, please full refresh the table.
Declared schema:
root
|-- id: long (nullable = true)
|-- account_siv_id: string (nullable = true)
Inferred schema:
root
|-- account_type: string (nullable = false)
|-- account_siv_id: string (nullable = true)
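To make the mismatch explicit, here is a small sketch that diffs the two schemas using only the columns shown in the error above. The dictionaries and the diff_schemas helper are just for illustration and are not part of my pipeline code:

```python
# Schemas copied from the error message above: column name -> (type, nullable).
declared = {
    "id": ("long", True),
    "account_siv_id": ("string", True),
}
inferred = {
    "account_type": ("string", False),
    "account_siv_id": ("string", True),
}


def diff_schemas(declared, inferred):
    """Return (columns only in declared, columns only in inferred,
    shared columns whose type or nullability differs)."""
    only_declared = sorted(set(declared) - set(inferred))
    only_inferred = sorted(set(inferred) - set(declared))
    changed = sorted(
        name
        for name in set(declared) & set(inferred)
        if declared[name] != inferred[name]
    )
    return only_declared, only_inferred, changed


only_declared, only_inferred, changed = diff_schemas(declared, inferred)
print("only in declared schema:", only_declared)
print("only in inferred schema:", only_inferred)
print("type or nullability changed:", changed)
```

So the stored (declared) schema still carries the old id column, while the new query produces account_type instead.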
I’ve attempted to reset the pipeline by dropping the table using:
DROP TABLE dev.dds.dim_office PURGE;
This appears to drop the table successfully, but does not remove the associated checkpoint metadata. I have also tried manually deleting the checkpoint folder:
dbutils.fs.rm("abfss://test@xxxxxx.dfs.core.windows.net/managed/__unitystorage/catalogs/<catalog_id>/checkpoints/dim_office", recurse=True)
However, this returns an error:
overlaps with managed storage within 'CheckPathAccess' call
It seems that checkpoint metadata is not being fully cleared when the table is dropped, and I am unable to force a fresh recomputation or schema reset due to this residual state.
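For reference, my understanding is that the full refresh the error message asks for can be started through the Pipelines REST API. Below is a minimal sketch of that request, assuming the POST /api/2.0/pipelines/{pipeline_id}/updates endpoint with a full_refresh_selection field. The host, pipeline ID, token and the build_full_refresh_request helper are placeholders I made up for illustration, and the sketch only builds the request without sending it:

```python
import json
import urllib.request


def build_full_refresh_request(host, pipeline_id, token, tables):
    """Build (but do not send) a request that starts a pipeline update
    with a full refresh limited to the given tables."""
    url = f"{host.rstrip('/')}/api/2.0/pipelines/{pipeline_id}/updates"
    # Assumption: full_refresh_selection lists the tables to fully refresh.
    payload = {"full_refresh_selection": list(tables)}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values; replace with real ones before sending via urlopen().
req = build_full_refresh_request(
    "https://<workspace-host>", "<pipeline_id>", "<token>", ["dim_office"]
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Whether a full refresh like this also clears the checkpoint state and the stored schema for a table that was dropped manually is part of what I am asking.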
My Use Case:
As part of the early development/POC phase, I need to be able to iterate quickly: dropping and re-creating tables (including schema changes) without leftover checkpoint or schema metadata getting in the way. This is proving to be a major obstacle to adopting DLT pipelines in production under Unity Catalog.
Request/Advice Sought:
How can I fully reset a DLT table, including its schema, checkpoint, and lineage metadata?