Lakeflow Connect - Pending ‘full refresh’ process that needs to be removed in gateway pipeline.
yesterday
Hello, we have the following issue that we have been unable to resolve.
yesterday
Based on the events you've shared, it does appear that the gateway is recognizing the configuration change (Tables removed: Gestiones) but is still attempting to process a previously initiated snapshot request for that table.
A few things stand out:
- The table was successfully removed from the managed ingestion pipeline definition.
- The gateway reinitialized and acknowledged the removal.
- Despite that, a subsequent SNAPSHOT_STARTED event was emitted for gateway_cdc_Gestiones.
- There are no corresponding completion, cancellation, termination, or removal events for that flow.
That sequence suggests there may be a pending snapshot request associated with the earlier full refresh that was not fully cleared when the table was removed from the pipeline configuration.
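To make that check repeatable, here is a minimal sketch that scans exported gateway events for flows whose snapshot started but never reached a terminal state. It assumes the events have been exported as a list of dictionaries; the field names (`timestamp`, `flow_name`, `event_type`), the terminal event names, and the `gateway_cdc_Clientes` flow are hypothetical and would need to be adjusted to match what your event log actually contains. Only SNAPSHOT_STARTED and gateway_cdc_Gestiones come from the events in this thread.

```python
from typing import Iterable

# Assumed names for terminal events; only SNAPSHOT_STARTED is confirmed by the thread.
TERMINAL_EVENTS = {"SNAPSHOT_COMPLETED", "SNAPSHOT_CANCELLED", "SNAPSHOT_FAILED"}


def find_open_snapshots(events: Iterable[dict]) -> list[str]:
    """Return flow names whose latest SNAPSHOT_STARTED has no later terminal event."""
    open_flows: dict[str, str] = {}
    # ISO-8601 UTC timestamps sort correctly as plain strings.
    for event in sorted(events, key=lambda e: e["timestamp"]):
        flow, kind = event["flow_name"], event["event_type"]
        if kind == "SNAPSHOT_STARTED":
            open_flows[flow] = event["timestamp"]
        elif kind in TERMINAL_EVENTS:
            open_flows.pop(flow, None)
    return sorted(open_flows)


# Hypothetical sample data for illustration only.
sample_events = [
    {"timestamp": "2024-01-01T10:00:00Z", "flow_name": "gateway_cdc_Clientes",
     "event_type": "SNAPSHOT_STARTED"},
    {"timestamp": "2024-01-01T10:05:00Z", "flow_name": "gateway_cdc_Clientes",
     "event_type": "SNAPSHOT_COMPLETED"},
    {"timestamp": "2024-01-01T10:10:00Z", "flow_name": "gateway_cdc_Gestiones",
     "event_type": "SNAPSHOT_STARTED"},
]

print(find_open_snapshots(sample_events))
```

A flow that shows up in this list after its table has been removed from the ingestion definition is a candidate for the stale-state situation described above.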
At this point, I am not aware of a documented self-service procedure to surgically remove a single pending snapshot or orphaned flow from the gateway state. If the gateway continues to start gateway_cdc_Gestiones after the table has been removed from the ingestion definition, this feels more like a stale gateway-state issue than a configuration issue.
Given the impact on the remaining replicated tables, I would recommend opening a Databricks support case and including:
- Gateway pipeline name and ID
- Managed ingestion pipeline name and ID
- The snapshot_request_timestamp value
- The REINITIALIZING event showing the table removal
- The subsequent SNAPSHOT_STARTED event for gateway_cdc_Gestiones
- Confirmation that the table was removed from the ingestion definition, and a note that source-side CDC for Gestiones was subsequently disabled (in case that affects how the pending flow can be cleared)
That should give the team enough information to determine whether there is a stale snapshot request that needs to be cleared from the gateway state.
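If it helps to assemble that evidence consistently, the following is a rough sketch that pulls the two key events out of an exported event list and serializes them for attachment to the case. The event schema (`timestamp`, `event_type`, `flow_name`, `message`) and the helper name `build_case_evidence` are assumptions for illustration, not a Databricks API; the timestamps are made up.

```python
import json


def build_case_evidence(events: list[dict], table: str, flow_name: str) -> dict:
    """Collect the removal event and any SNAPSHOT_STARTED events that follow it."""
    ordered = sorted(events, key=lambda e: e["timestamp"])
    # First REINITIALIZING event whose message mentions the removed table.
    removal = next(
        (e for e in ordered
         if e["event_type"] == "REINITIALIZING" and table in e.get("message", "")),
        None,
    )
    # Snapshot starts for the flow that occur after the removal was acknowledged.
    snapshots_after = [
        e for e in ordered
        if removal is not None
        and e["event_type"] == "SNAPSHOT_STARTED"
        and e.get("flow_name") == flow_name
        and e["timestamp"] > removal["timestamp"]
    ]
    return {
        "table": table,
        "flow_name": flow_name,
        "removal_event": removal,
        "snapshot_started_after_removal": snapshots_after,
    }


# Hypothetical exported events; only the event types, the flow name, and the
# "Tables removed: Gestiones" message text are taken from the thread.
exported = [
    {"timestamp": "2024-01-01T09:00:00Z", "event_type": "REINITIALIZING",
     "message": "Tables removed: Gestiones"},
    {"timestamp": "2024-01-01T09:02:00Z", "event_type": "SNAPSHOT_STARTED",
     "flow_name": "gateway_cdc_Gestiones", "message": ""},
]

evidence = build_case_evidence(exported, "Gestiones", "gateway_cdc_Gestiones")
print(json.dumps(evidence, indent=2))
```

The pipeline names, IDs, and the snapshot_request_timestamp value would still need to be added by hand, since they are not derivable from this sketch.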
One caveat: leaving the gateway stopped indefinitely is not risk-free either, since CDC changes can eventually age out of the source retention window and force additional full refreshes for other tables. If the gateway must remain paused while this is investigated, it may be worth discussing with Databricks Support what the safest path is to resume replication for the remaining tables without re-triggering a snapshot for Gestiones.
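To put a number on that risk, a simple calculation like the one below can estimate how long the gateway can stay paused before changes start aging out. This is only a sketch: the three-day retention value and the stop time are hypothetical placeholders, and the real retention window depends on how CDC is configured on your source database.

```python
from datetime import datetime, timedelta, timezone


def resume_deadline(gateway_stopped_at: datetime, retention: timedelta) -> datetime:
    """Latest time the gateway can resume before the oldest unread change ages out."""
    return gateway_stopped_at + retention


def time_left(gateway_stopped_at: datetime, retention: timedelta, now: datetime) -> timedelta:
    """Remaining margin before the deadline; negative means changes may already be lost."""
    return resume_deadline(gateway_stopped_at, retention) - now


# Hypothetical values; replace with the actual stop time and source retention setting.
stopped_at = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
retention = timedelta(days=3)
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)

deadline = resume_deadline(stopped_at, retention)
print("Resume before:", deadline.isoformat())
print("Time left:", time_left(stopped_at, retention, now))
```

Treat the result as an upper bound rather than a guarantee, since the gateway may have been behind the source when it was stopped.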
The key question I'd like clarified is whether removing a table from a managed ingestion pipeline is expected to automatically cancel any queued or in-progress snapshot requests, or whether additional cleanup of gateway state is required when a full refresh has already been initiated.