<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delta Live Tables - CDC - Batching - Delta Tables in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91300#M38139</link>
    <description>&lt;DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;Replace the target block with this and test&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;dlt.create_streaming_table("table_old")&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;</description>
    <pubDate>Sat, 21 Sep 2024 22:39:38 GMT</pubDate>
    <dc:creator>gchandra</dc:creator>
    <dc:date>2024-09-21T22:39:38Z</dc:date>
    <item>
      <title>Delta Live Tables - CDC - Batching - Delta Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91282#M38128</link>
      <description>&lt;P&gt;Hey Folks, I'm trying to implement CDC - Apply changes from one delta table to another. Source is&amp;nbsp; a delta table named table_latest and target is another delta table named table_old. Both are delta tables in databricks. Im trying to cascade the incremental changes from table_latest to table_old using DLT.&amp;nbsp; Below is my code Im using:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; dlt&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; pyspark.sql.functions &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; col&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"source_table_dlt"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;source_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"delta"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"table_latest"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; )&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;@&lt;/SPAN&gt;&lt;SPAN&gt;dlt&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;name&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"target_table_dlt"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;def&lt;/SPAN&gt; &lt;SPAN&gt;target_table&lt;/SPAN&gt;&lt;SPAN&gt;():&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;return&lt;/SPAN&gt;&lt;SPAN&gt; (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"delta"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;table&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"table_old"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; )&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;dlt.&lt;/SPAN&gt;&lt;SPAN&gt;apply_changes&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;target&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"target_table_dlt"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;source&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"source_table_dlt"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;keys&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;[&lt;/SPAN&gt;&lt;SPAN&gt;"id"&lt;/SPAN&gt;&lt;SPAN&gt;],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;sequence_by&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"import_date"&lt;/SPAN&gt;&lt;SPAN&gt;))&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;The source code seems to run successfully. But when I run the Delta Live Table&amp;nbsp; pipeline I get the following error:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;AnalysisException: Cannot have multiple queries named `target_table` for `target_table`. Additional queries on that table must be named. Note that unnamed queries default to the same name as the table.,None,Map(),Map(),List(),List(),Map()).&lt;BR /&gt;&lt;BR /&gt;Am I missing something fundamental here ? Please help.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Sat, 21 Sep 2024 13:48:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91282#M38128</guid>
      <dc:creator>aniruth1000</dc:creator>
      <dc:date>2024-09-21T13:48:30Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables - CDC - Batching - Delta Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91300#M38139</link>
      <description>&lt;DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;Replace the target block with this and test&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;dlt.create_streaming_table("table_old")&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Sat, 21 Sep 2024 22:39:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91300#M38139</guid>
      <dc:creator>gchandra</dc:creator>
      <dc:date>2024-09-21T22:39:38Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables - CDC - Batching - Delta Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91314#M38147</link>
      <description>&lt;P&gt;Hi,&amp;nbsp; Thanks for your response - I tried doing the above and I assume my apply changes should&amp;nbsp; look like:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;dlt.&lt;/SPAN&gt;&lt;SPAN&gt;apply_changes&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;target&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;"table_old"&lt;/SPAN&gt;&lt;SPAN&gt;, #&amp;lt;-- this is the actual delta table&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;source&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;"source_table_dlt"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;keys&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;[&lt;/SPAN&gt;&lt;SPAN&gt;"id"&lt;/SPAN&gt;&lt;SPAN&gt;],&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;sequence_by&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;col&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"import_date"&lt;/SPAN&gt;&lt;SPAN&gt;))&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;When I run the pipeline, I get the following error:&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;STRONG&gt;com.databricks.pipelines.common.errors.DLTAnalysisException: Could not materialize 'table_old' because a MANAGED table already exists with that name.&lt;/STRONG&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Am I getting this because table_old is a delta table and not a delta live table ? Appreciate any help in resolving this. TIA.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;</description>
      <pubDate>Sun, 22 Sep 2024 04:51:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91314#M38147</guid>
      <dc:creator>aniruth1000</dc:creator>
      <dc:date>2024-09-22T04:51:06Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Live Tables - CDC - Batching - Delta Tables</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91323#M38150</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/122195"&gt;@aniruth1000&lt;/a&gt;&amp;nbsp;,&lt;/P&gt;&lt;P&gt;When using delta live table pipelines, only the source table can be the delta table.&lt;BR /&gt;The target table must be fully managed by the DLT pipeline, including its creation and lifecycle.&lt;BR /&gt;&lt;BR /&gt;Let's say that you modified the code as suggested by&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/8379"&gt;@gchandra&lt;/a&gt;, and your code looks like below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="python"&gt;import dlt
from pyspark.sql.functions import col


@dlt.table(name="source_table_dlt")
def source_table():
    return (
        spark.read.format("delta").table("table_latest")
    )

dlt.create_streaming_table("table_old")

dlt.apply_changes(
    target = "target_old",
    source = "source_table_dlt",
    keys=["id"],
    sequence_by= col("import_date"))&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The requirement is that target is not an existing delta table not created by DLT.&lt;/P&gt;&lt;P&gt;If a table the given name (target_old) name already exists as a managed Delta table (not created and managed by DLT), DLT will throw an error because it cannot take over the management of an existing managed table not created by it. This is what is happening in your case.&lt;/P&gt;&lt;P&gt;How to solve it?&lt;BR /&gt;&lt;BR /&gt;The requirements:&lt;/P&gt;&lt;P&gt;1. Your target table will be loaded with data from "table_latest" on a regular basis&lt;/P&gt;&lt;P&gt;2. Your target table must also contain data from "table_old"&lt;/P&gt;&lt;P&gt;The steps:&lt;/P&gt;&lt;P&gt;1. Create a dlt pipeline as above&lt;/P&gt;&lt;P&gt;2. Change the target table to a different table name, like "table_target"&lt;/P&gt;&lt;P&gt;3. Run a one-time data-backfill from table_old as described in &lt;A href="https://docs.databricks.com/en/delta-live-tables/flows.html#example-run-a-one-time-data-backfill" target="_self"&gt;the docs.&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 22 Sep 2024 08:39:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-live-tables-cdc-batching-delta-tables/m-p/91323#M38150</guid>
      <dc:creator>filipniziol</dc:creator>
      <dc:date>2024-09-22T08:39:46Z</dc:date>
    </item>
  </channel>
</rss>

