<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Lakeflow Connect - SQL Server - Issues restarting after failure in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154811#M54145</link>
    <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've done some research internally, and the thing that usually catches people out is that Lakeflow Connect for SQL Server isn't a single pipeline; it's actually made up of several components.&lt;/P&gt;
&lt;P&gt;There's a gateway (talks to SQL Server, writes to staging) and a separate ingestion pipeline (reads staging, writes to UC). If you only destroyed the ingestion pipeline, the gateway could still be sitting on the broken state. &lt;BR /&gt;&lt;BR /&gt;A few things to check and clean up: &lt;BR /&gt;&lt;BR /&gt;- Both pipelines: destroy the gateway too, not just the ingestion side. &lt;BR /&gt;- SQL Server side: Lakeflow recreates its own lakeflow_* capture instances on full refresh, but a broken one can stick around. Check cdc.change_tables and EXEC sys.sp_cdc_help_change_data_capture for stale lakeflow_* rows. Disabling and re-enabling CDC on the source table usually clears it.&lt;BR /&gt;- UC side: destination tables drop on pipeline delete, but staging volumes hang around for 25–30 days, and any schemas the pipeline created in UC don't always go with it. It's worth dropping the staging schema/volume before you recreate. &lt;BR /&gt;&lt;BR /&gt;Order of operations that should work: stop and delete the ingestion pipeline → stop and delete the gateway → clean up SQL Server CDC on the affected tables → drop residual UC staging schemas → recreate the gateway → recreate the ingestion pipeline (with a fresh staging location if you can).&lt;BR /&gt;&lt;BR /&gt;Also: open a support case. The docs say a full refresh should recover from INCOMPATIBLE_SCHEMA_CHANGE and that destination tables drop on delete; if this isn't happening, our engineers should know about it. Include the original DDL diff, the failure event log, and what failed on both the refresh and the recreate.&lt;/P&gt;
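&lt;P&gt;For reference, the disable/re-enable step is usually just the standard CDC stored procedures (dbo and your_table below are placeholders; substitute your own schema and table, and double-check against your environment first):&lt;BR /&gt;&lt;BR /&gt;EXEC sys.sp_cdc_disable_table @source_schema = N'dbo', @source_name = N'your_table', @capture_instance = N'all';&lt;BR /&gt;EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'your_table', @role_name = NULL;&lt;BR /&gt;&lt;BR /&gt;Then let the gateway recreate its own lakeflow_* capture instance on the next run rather than creating one by hand.&lt;/P&gt;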
&lt;P&gt;I hope this helps.&lt;/P&gt;
&lt;P&gt;Thanks,&lt;BR /&gt;Emma&lt;/P&gt;
    <pubDate>Fri, 17 Apr 2026 14:39:41 GMT</pubDate>
    <dc:creator>emma_s</dc:creator>
    <dc:date>2026-04-17T14:39:41Z</dc:date>
    <item>
      <title>Lakeflow Connect - SQL Server - Issues restarting after failure</title>
      <link>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154685#M54124</link>
      <description>&lt;P&gt;Has anyone else run into a situation where a breaking schema change on a SQL Server source table leaves their Lakeflow Connect pipeline in a state it can't recover from — even after destroying and recreating the pipeline?&lt;/P&gt;&lt;P&gt;Here's what happened to us:&lt;/P&gt;&lt;P&gt;- We had a breaking schema change on one of our source tables&lt;BR /&gt;- The CDC incrementals broke as expected, so we triggered a full refresh&lt;BR /&gt;- The pipeline never stabilized — it continued to fail on subsequent runs&lt;BR /&gt;- We destroyed the pipeline and recreated it from scratch&lt;BR /&gt;- Even after recreating, we were unable to get the pipeline to a healthy state&lt;/P&gt;&lt;P&gt;I'm aware that Databricks docs acknowledge that rows prior to a schema change aren't guaranteed to have been ingested before the pipeline fails (`INCOMPATIBLE_SCHEMA_CHANGE`). But the expectation is that a full refresh — or at minimum a destroy and recreate — should get you back to a clean state. In our experience, neither did.&lt;/P&gt;</description>
      <pubDate>Wed, 15 Apr 2026 21:49:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154685#M54124</guid>
      <dc:creator>lrm_data</dc:creator>
      <dc:date>2026-04-15T21:49:46Z</dc:date>
    </item>
    <item>
      <title>Re: Lakeflow Connect - SQL Server - Issues restarting after failure</title>
      <link>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154811#M54145</link>
      <description>&lt;P&gt;Hi,&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've done some research internally, and the thing that usually catches people out is that Lakeflow Connect for SQL Server isn't a single pipeline; it's actually made up of several components.&lt;/P&gt;
&lt;P&gt;There's a gateway (talks to SQL Server, writes to staging) and a separate ingestion pipeline (reads staging, writes to UC). If you only destroyed the ingestion pipeline, the gateway could still be sitting on the broken state. &lt;BR /&gt;&lt;BR /&gt;A few things to check and clean up: &lt;BR /&gt;&lt;BR /&gt;- Both pipelines: destroy the gateway too, not just the ingestion side. &lt;BR /&gt;- SQL Server side: Lakeflow recreates its own lakeflow_* capture instances on full refresh, but a broken one can stick around. Check cdc.change_tables and EXEC sys.sp_cdc_help_change_data_capture for stale lakeflow_* rows. Disabling and re-enabling CDC on the source table usually clears it.&lt;BR /&gt;- UC side: destination tables drop on pipeline delete, but staging volumes hang around for 25–30 days, and any schemas the pipeline created in UC don't always go with it. It's worth dropping the staging schema/volume before you recreate. &lt;BR /&gt;&lt;BR /&gt;Order of operations that should work: stop and delete the ingestion pipeline → stop and delete the gateway → clean up SQL Server CDC on the affected tables → drop residual UC staging schemas → recreate the gateway → recreate the ingestion pipeline (with a fresh staging location if you can).&lt;BR /&gt;&lt;BR /&gt;Also: open a support case. The docs say a full refresh should recover from INCOMPATIBLE_SCHEMA_CHANGE and that destination tables drop on delete; if this isn't happening, our engineers should know about it. Include the original DDL diff, the failure event log, and what failed on both the refresh and the recreate.&lt;/P&gt;
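&lt;P&gt;For reference, the disable/re-enable step is usually just the standard CDC stored procedures (dbo and your_table below are placeholders; substitute your own schema and table, and double-check against your environment first):&lt;BR /&gt;&lt;BR /&gt;EXEC sys.sp_cdc_disable_table @source_schema = N'dbo', @source_name = N'your_table', @capture_instance = N'all';&lt;BR /&gt;EXEC sys.sp_cdc_enable_table @source_schema = N'dbo', @source_name = N'your_table', @role_name = NULL;&lt;BR /&gt;&lt;BR /&gt;Then let the gateway recreate its own lakeflow_* capture instance on the next run rather than creating one by hand.&lt;/P&gt;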
&lt;P&gt;I hope this helps.&lt;/P&gt;
&lt;P&gt;Thanks,&lt;BR /&gt;Emma&lt;/P&gt;
      <pubDate>Fri, 17 Apr 2026 14:39:41 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154811#M54145</guid>
      <dc:creator>emma_s</dc:creator>
      <dc:date>2026-04-17T14:39:41Z</dc:date>
    </item>
    <item>
      <title>Re: Lakeflow Connect - SQL Server - Issues restarting after failure</title>
      <link>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154856#M54148</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/227217"&gt;@lrm_data&lt;/a&gt;&amp;nbsp;y&lt;SPAN&gt;es, this one catches a lot of people. A few things to check on the SQL Server side that commonly block recovery even after destroy + recreate:&lt;/SPAN&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;STRONG&gt;Stale lakeflow_*&lt;/STRONG&gt; capture instance. SQL Server allows only 2 capture instances per table. If both slots are occupied - often a live one plus a stale one left from the failed schema change - Lakeflow can't do full refresh or schema evolution, even on a fresh pipeline. Check:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;SELECT capture_instance, start_lsn, create_date&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;FROM cdc.change_tables ct&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;JOIN sys.tables t ON ct.source_object_id = t.object_id&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;WHERE t.name = '&lt;/SPAN&gt;&lt;SPAN&gt;&amp;lt;&lt;/SPAN&gt;&lt;SPAN&gt;your_table&lt;/SPAN&gt;&lt;SPAN&gt;&amp;gt;&lt;/SPAN&gt;&lt;SPAN&gt;';&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Disable + re-enable CDC on the table to clear stale ones.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&lt;STRONG&gt;Lakeflow Connect&lt;/STRONG&gt; is two pipelines. A gateway (SQL Server -&amp;gt; UC staging) and an ingestion pipeline (staging -&amp;gt; destination). Destroying only the ingestion side leaves the gateway on the broken state. Both must go.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Staging volume doesn't always drop cleanly. Volumes persist 25–30 days after deletion, and UC schemas the pipeline created can hang around. Reusing the old staging location is the #1 reason "recreated from scratch" isn't actually from scratch. 
Drop it manually and use a fresh one.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Recovery order that works:&lt;/SPAN&gt;&lt;/DIV&gt;&lt;OL&gt;&lt;LI&gt;&lt;SPAN&gt;Stop + delete ingestion pipeline&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Stop + delete gateway&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Disable CDC on the table, confirm no stale lakeflow_* instances remain, re-enable CDC&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Drop the residual UC staging schema/volume&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Recreate gateway with a fresh staging location&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;Recreate ingestion pipeline&lt;/SPAN&gt;&lt;/LI&gt;&lt;/OL&gt;&lt;DIV&gt;&lt;SPAN&gt;If the table has a primary key, switch to change tracking (CT) instead of CDC - Databricks' own recommendation. CT doesn't use capture instances and this whole class of problem goes away.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;If all the above is clean and it still won't recover, open a support case with the utility script version, capture instance state before/after, and gateway driver logs.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sat, 18 Apr 2026 05:03:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/lakeflow-connect-sql-server-issues-restarting-after-failure/m-p/154856#M54148</guid>
      <dc:creator>abhi_dabhi</dc:creator>
      <dc:date>2026-04-18T05:03:43Z</dc:date>
    </item>
  </channel>
</rss>

