<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic 📊 Simplifying CDC with Databricks Delta Live Tables &amp; Snapshots 📊 in Community Articles</title>
    <link>https://community.databricks.com/t5/community-articles/simplifying-cdc-with-databricks-delta-live-tables-amp-snapshots/m-p/89544#M263</link>
    <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN&gt;In the world of data integration, synchronizing external relational databases (like Oracle or MySQL) with the Databricks platform can be complex, especially when a change data feed (CDF) isn’t available from the source. Full-table snapshots are a practical way to handle this!&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":small_blue_diamond:"&gt;🔹&lt;/span&gt; What are Snapshots? Snapshots capture the state of your data at a given point in time, making it easier to track changes over time and maintain consistency in your data lake.&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":small_blue_diamond:"&gt;🔹&lt;/span&gt; SCD Type 1 &amp;amp; 2 Implementation: Delta Live Tables (DLT) in Databricks simplifies handling Slowly Changing Dimensions (SCD) from snapshots, with two main approaches to storing them:&lt;BR /&gt;Snapshot Replacement: Overwrite the existing snapshot with a new one.&lt;BR /&gt;Snapshot Accumulation: Maintain multiple snapshots over time for a historical view.&lt;BR /&gt;DLT’s APPLY CHANGES FROM SNAPSHOT feature streamlines processing these snapshots, allowing you to store records as SCD Type 1 (overwrite) or Type 2 (track historical changes).&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":small_blue_diamond:"&gt;🔹&lt;/span&gt; Push vs. Pull-Based Snapshots:&lt;BR /&gt;Push-Based: Efficient and initiated directly from the source.&lt;BR /&gt;Pull-Based: More flexible but can be resource-intensive, ideal for large data sources.&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":hammer_and_wrench:"&gt;🛠&lt;/span&gt;️ Delta Live Tables Pipelines: With DLT, you can efficiently process CDC data from full snapshots, applying logic to track changes in your data over time and support complex ETL pipelines.&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":pushpin:"&gt;📌&lt;/span&gt; Whether you're managing customer data, tracking order history, or analyzing product changes, using snapshots in DLT with Databricks offers flexibility and performance.&lt;BR /&gt;&lt;BR /&gt;Want to implement this? See &lt;A href="https://www.databricks.com/blog/how-perform-change-data-capture-cdc-full-table-snapshots-using-delta-live-tables" target="_blank" rel="noopener"&gt;How to perform change data capture (CDC) from full table snapshots using Delta Live Tables&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN&gt;&lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=databricks&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;Databricks&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=deltalivetables&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DeltaLiveTables&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=changedatacapture&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;ChangeDataCapture&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=dataengineering&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DataEngineering&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=datasnapshots&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DataSnapshots&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=etl&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;ETL&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=bigdata&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;BigData&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=datapipeline&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DataPipeline&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Pull-Based Snapshots.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11095i2A308D55360DE292/image-size/large?v=v2&amp;amp;px=999" role="button" title="Pull-Based Snapshots.png" alt="Pull-Based Snapshots.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Thu, 12 Sep 2024 05:37:19 GMT</pubDate>
    <dc:creator>Ajay-Pandey</dc:creator>
    <dc:date>2024-09-12T05:37:19Z</dc:date>
    <item>
      <title>📊 Simplifying CDC with Databricks Delta Live Tables &amp; Snapshots 📊</title>
      <link>https://community.databricks.com/t5/community-articles/simplifying-cdc-with-databricks-delta-live-tables-amp-snapshots/m-p/89544#M263</link>
      <description>&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN&gt;In the world of data integration, synchronizing external relational databases (like Oracle or MySQL) with the Databricks platform can be complex, especially when a change data feed (CDF) isn’t available from the source. Full-table snapshots are a practical way to handle this!&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":small_blue_diamond:"&gt;🔹&lt;/span&gt; What are Snapshots? Snapshots capture the state of your data at a given point in time, making it easier to track changes over time and maintain consistency in your data lake.&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":small_blue_diamond:"&gt;🔹&lt;/span&gt; SCD Type 1 &amp;amp; 2 Implementation: Delta Live Tables (DLT) in Databricks simplifies handling Slowly Changing Dimensions (SCD) from snapshots, with two main approaches to storing them:&lt;BR /&gt;Snapshot Replacement: Overwrite the existing snapshot with a new one.&lt;BR /&gt;Snapshot Accumulation: Maintain multiple snapshots over time for a historical view.&lt;BR /&gt;DLT’s APPLY CHANGES FROM SNAPSHOT feature streamlines processing these snapshots, allowing you to store records as SCD Type 1 (overwrite) or Type 2 (track historical changes).&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":small_blue_diamond:"&gt;🔹&lt;/span&gt; Push vs. Pull-Based Snapshots:&lt;BR /&gt;Push-Based: Efficient and initiated directly from the source.&lt;BR /&gt;Pull-Based: More flexible but can be resource-intensive, ideal for large data sources.&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":hammer_and_wrench:"&gt;🛠&lt;/span&gt;️ Delta Live Tables Pipelines: With DLT, you can efficiently process CDC data from full snapshots, applying logic to track changes in your data over time and support complex ETL pipelines.&lt;BR /&gt;&lt;span class="lia-unicode-emoji" title=":pushpin:"&gt;📌&lt;/span&gt; Whether you're managing customer data, tracking order history, or analyzing product changes, using snapshots in DLT with Databricks offers flexibility and performance.&lt;BR /&gt;&lt;BR /&gt;Want to implement this? See &lt;A href="https://www.databricks.com/blog/how-perform-change-data-capture-cdc-full-table-snapshots-using-delta-live-tables" target="_blank" rel="noopener"&gt;How to perform change data capture (CDC) from full table snapshots using Delta Live Tables&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN&gt;&lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=databricks&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;Databricks&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=deltalivetables&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DeltaLiveTables&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=changedatacapture&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;ChangeDataCapture&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=dataengineering&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DataEngineering&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=datasnapshots&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DataSnapshots&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=etl&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;ETL&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=bigdata&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;BigData&lt;/A&gt; &lt;A class="" href="https://www.linkedin.com/feed/hashtag/?keywords=datapipeline&amp;amp;highlightedUpdateUrns=urn%3Ali%3Aactivity%3A7239868020863463426" target="_blank" rel="noopener"&gt;&lt;SPAN&gt;#&lt;/SPAN&gt;DataPipeline&lt;/A&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV class=""&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class=""&gt;&lt;SPAN class=""&gt;&lt;SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Pull-Based Snapshots.png" style="width: 999px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/11095i2A308D55360DE292/image-size/large?v=v2&amp;amp;px=999" role="button" title="Pull-Based Snapshots.png" alt="Pull-Based Snapshots.png" /&gt;&lt;/span&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 12 Sep 2024 05:37:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/simplifying-cdc-with-databricks-delta-live-tables-amp-snapshots/m-p/89544#M263</guid>
      <dc:creator>Ajay-Pandey</dc:creator>
      <dc:date>2024-09-12T05:37:19Z</dc:date>
    </item>
    <item>
      <title>Re: 📊 Simplifying CDC with Databricks Delta Live Tables &amp; Snapshots 📊</title>
      <link>https://community.databricks.com/t5/community-articles/simplifying-cdc-with-databricks-delta-live-tables-amp-snapshots/m-p/109411#M363</link>
      <description>&lt;P&gt;Hi Ajay,&lt;/P&gt;&lt;P&gt;Can APPLY CHANGES FROM SNAPSHOT handle reprocessing of an older snapshot?&lt;/P&gt;&lt;P&gt;Use case:&lt;/P&gt;&lt;P&gt;- The source has delivered data on days T, T1, and T2.&lt;/P&gt;&lt;P&gt;- Consumers realise there is an error in the day T data and make a correction in the source. The source then redelivers the T data. How will APPLY CHANGES FROM SNAPSHOT handle this use case? Or how would you advise we handle it?&lt;/P&gt;</description>
      <pubDate>Fri, 07 Feb 2025 14:12:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/community-articles/simplifying-cdc-with-databricks-delta-live-tables-amp-snapshots/m-p/109411#M363</guid>
      <dc:creator>BilalHaniff1</dc:creator>
      <dc:date>2025-02-07T14:12:45Z</dc:date>
    </item>
  </channel>
</rss>

