Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Converting Existing Streaming Job to Delta Live Tables with Historical Backfill

yazz
New Contributor II

Description:
I’m migrating a two-stage streaming job into Delta Live Tables (DLT), sketched below:

  • Bronze: read from Pub/Sub → write to Bronze table

  • Silver: use create_auto_cdc_flow on Bronze → upsert into Silver table
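
For context, the current pipeline looks roughly like this; table names, key columns, and Pub/Sub connection options are placeholders (auth options omitted):

```python
import dlt
from pyspark.sql.functions import col

# Bronze: stream raw messages from Pub/Sub into a Bronze table
@dlt.table(name="bronze_events")
def bronze_events():
    return (
        spark.readStream.format("pubsub")
        .option("subscriptionId", "my-subscription")  # placeholder
        .option("topicId", "my-topic")                # placeholder
        .option("projectId", "my-project")            # placeholder
        .load()
    )

# Silver: declare the target table, then upsert changes from Bronze
dlt.create_streaming_table("silver_events")

dlt.create_auto_cdc_flow(
    target="silver_events",
    source="bronze_events",
    keys=["id"],                  # placeholder key column
    sequence_by=col("event_ts"),  # placeholder ordering column
)
```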

New data works perfectly, but I now need to backfill history into the same Silver table. I’m blocked by two DLT constraints:

  1. Single-flow target: you can’t have two separate flows write to the same table

  2. No mixed modes: you can’t combine an append-only flow (preview) with create_auto_cdc_flow on one target

I tried widget-driven conditional logic to switch between backfill and CDC (reconstructed roughly below), but no data is written during backfill.
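
The switch looked something like this; I’ve written it here with a pipeline configuration parameter instead of a notebook widget, and all names are placeholders:

```python
import dlt
from pyspark.sql.functions import col

# Placeholder switch: "pipeline.mode" would be set in the pipeline configuration
mode = spark.conf.get("pipeline.mode", "cdc")

dlt.create_streaming_table("silver_events")

if mode == "backfill":
    # One-time load of history into Silver
    @dlt.append_flow(target="silver_events")
    def backfill_history():
        return spark.read.table("historical_backup")  # placeholder source
else:
    # Normal operation: CDC upserts from Bronze
    dlt.create_auto_cdc_flow(
        target="silver_events",
        source="bronze_events",
        keys=["id"],
        sequence_by=col("event_ts"),
    )
```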

Request:
Has anyone backfilled historical data into a DLT-managed CDC table? What workarounds or patterns did you use to load history without conflicting with the live CDC flow? Any code snippets or best practices welcome.

2 REPLIES

szymon_dybczak
Esteemed Contributor III

Hi @yazz ,

I’m wondering if you could use an approach similar to the one in the article below: just backfill your Bronze table first, and the downstream Silver and Gold layers will pick up the new data from Bronze. With that approach you don’t need workarounds for the DLT constraints (single-flow target and no mixed modes).

Backfilling historical data with Lakeflow Declarative Pipelines - Azure Databricks | Microsoft Learn
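
For illustration, a minimal sketch of that pattern, assuming the one-time append flow described in the linked article (table names are placeholders; see the article for the exact API):

```python
import dlt

# One-time flow that appends historical rows into the existing Bronze table.
# With once=True the flow runs a single time (and again only on a full refresh).
@dlt.append_flow(target="bronze_events", once=True)
def backfill_bronze():
    return spark.read.table("historical_backup")  # placeholder source table
```

The Silver CDC flow then consumes the backfilled rows from Bronze like any other new data.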

yazz
New Contributor II

Thanks for the reply, Szymon. 

Yes, I have seen the example, but it’s just that my Bronze table’s schema is different from my historical backup table’s. The Bronze table has 4 columns holding JSON data, while the historical table has 11 columns of structured data, so I can’t load the history into Bronze as-is.

Also, the example only covers append_flow. Technically, I need one flow that appends the backup table into Silver exactly once, and a second flow that keeps upserting into the same Silver table, roughly like this:
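
Sketched out (with placeholder names), this is exactly the combination that DLT rejects:

```python
import dlt
from pyspark.sql.functions import col

dlt.create_streaming_table("silver_events")

# One-time append of history straight into Silver...
@dlt.append_flow(target="silver_events", once=True)
def backfill_silver():
    return spark.read.table("historical_backup")  # placeholder source

# ...combined with ongoing CDC upserts into the same Silver target.
# Mixing an append flow and an auto CDC flow on one target is what DLT disallows.
dlt.create_auto_cdc_flow(
    target="silver_events",
    source="bronze_events",
    keys=["id"],
    sequence_by=col("event_ts"),
)
```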
