<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/104311#M1794</link>
    <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/120641"&gt;@David_Torrejon&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Doesn't you example code perform SCD Type 1 rather than Type 2?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;whenMatchedUpdate() updates an existing record.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;whenNotMatchedInsert() inserts new records.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;whenNotMatchedBySourceUpdate() updates records not available in the source&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In SCD Type 2, when an old record is updated, a corresponding new row needs to be inserted with&amp;nbsp;is_current as 'true'. Where is this happening?&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 06 Jan 2025 11:21:46 GMT</pubDate>
    <dc:creator>Prabhuram</dc:creator>
    <dc:date>2025-01-06T11:21:46Z</dc:date>
    <item>
      <title>Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/98909#M1692</link>
      <description>&lt;P&gt;Can anyone provide a sample MERGE INTO SQL query for implementing SCD Type 2 in Databricks using Delta Tables?&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2024 10:58:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/98909#M1692</guid>
      <dc:creator>Akshay_Petkar</dc:creator>
      <dc:date>2024-11-15T10:58:43Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/98912#M1693</link>
      <description>&lt;DIV&gt;Here an example for a customer table.&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;The source_table contains new or updated customer data, and the target_table is the Delta table that maintains historical records.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Table Structures&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;source_table: contains the latest customer data.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;customer_id: Unique identifier for the customer.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;name: Customer's name.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;address: Customer's address.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;email: Customer's email.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;phone: Customer's phone number.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;target_table: contains the historical customer data.&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;customer_id: Unique identifier for the customer.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;name: Customer's name.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;address: Customer's address.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;email: Customer's email.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;phone: Customer's phone number.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;valid_from: Date when the record became effective.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;valid_to: Date until the record is effective.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;is_current: Flag indicating the current active record.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;hash_value: Hash of the attributes to detect changes.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;WITH source_with_hash AS (&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; SELECT&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; customer_id,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; name,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; address,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; email,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; phone,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; md5(concat_ws('|', name, address, email, phone)) AS hash_value&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; FROM source_table&lt;/DIV&gt;&lt;DIV&gt;)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;MERGE INTO target_table AS target&lt;/DIV&gt;&lt;DIV&gt;USING source_with_hash AS source&lt;/DIV&gt;&lt;DIV&gt;ON target.customer_id = source.customer_id&lt;/DIV&gt;&lt;DIV&gt;AND target.is_current = true&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;WHEN MATCHED AND target.hash_value != source.hash_value THEN&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; UPDATE SET&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; target.valid_to = current_date - 1,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; target.is_current = false&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;WHEN NOT MATCHED BY TARGET THEN&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; INSERT (customer_id, name, address, email, phone, valid_from, valid_to, is_current, hash_value)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; VALUES (source.customer_id, source.name, source.address, source.email, source.phone, current_date, '9999-12-31', true, source.hash_value)&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;WHEN NOT MATCHED BY SOURCE AND target.is_current = true THEN&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; UPDATE SET&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; target.valid_to = current_date - 1,&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp; &amp;nbsp; target.is_current = false;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;Here the explanation about all parts of the sentence.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;WITH Clause:&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Creates a subquery source_with_hash that adds a hash_value column to the source_table. This column contains an MD5 hash of the relevant attributes to detect changes.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;MATCHED Clause:&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Handles updates where there are changes in the source data (source.hash_value is different from target.hash_value).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Updates the valid_to date of the current record in the target table to the previous day and sets is_current to false.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;NOT MATCHED BY TARGET Clause:&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Inserts new records that do not exist in the target table.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Inserts the new records with valid_from set to the current date, valid_to set to '9999-12-31', and is_current set to true.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;NOT MATCHED BY SOURCE Clause:&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Handles records that are in the target table but not in the source table (optional, if you want to handle deletions).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Updates the valid_to date to the previous day and sets is_current to false.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;You only have to adjust the column names and logic according to your specific schema and requirements.&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;I hope it helps you.&lt;/DIV&gt;</description>
      <pubDate>Fri, 15 Nov 2024 11:42:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/98912#M1693</guid>
      <dc:creator>David_Torrejon</dc:creator>
      <dc:date>2024-11-15T11:42:10Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/98916#M1694</link>
      <description>&lt;P&gt;also, in PySpark, the same example in pyspark:&lt;/P&gt;&lt;P&gt;from pyspark.sql.functions import col, concat_ws, current_date, lit, md5&lt;/P&gt;&lt;P&gt;source_df = spark.table("source_table")&lt;BR /&gt;target_df = spark.table("target_table")&lt;/P&gt;&lt;P&gt;source_with_hash_df = source_df.withColumn("hash_value", md5(concat_ws("|", col("name"), col("address"),&amp;nbsp; col("email"), col("phone"))))&lt;/P&gt;&lt;P&gt;target_df.alias("target").merge(&lt;BR /&gt;source_with_hash_df.alias("source"),&lt;BR /&gt;"target.customer_id = source.customer_id AND target.is_current = true"&lt;BR /&gt;).whenMatchedUpdate(&lt;BR /&gt;condition="target.hash_value != source.hash_value",&lt;BR /&gt;set={&lt;BR /&gt;"valid_to": current_date() - 1,&lt;BR /&gt;"is_current": lit(False)&lt;BR /&gt;}&lt;BR /&gt;).whenNotMatchedInsert(&lt;BR /&gt;values={&lt;BR /&gt;"customer_id": col("source.customer_id"),&lt;BR /&gt;"name": col("source.name"),&lt;BR /&gt;"address": col("source.address"),&lt;BR /&gt;"email": col("source.email"),&lt;BR /&gt;"phone": col("source.phone"),&lt;BR /&gt;"valid_from": current_date(),&lt;BR /&gt;"valid_to": lit("9999-12-31"),&lt;BR /&gt;"is_current": lit(True),&lt;BR /&gt;"hash_value": col("source.hash_value")&lt;BR /&gt;}&lt;BR /&gt;).whenNotMatchedBySourceUpdate(&lt;BR /&gt;condition="target.is_current = true",&lt;BR /&gt;set={&lt;BR /&gt;"valid_to": current_date() - 1,&lt;BR /&gt;"is_current": lit(False)&lt;BR /&gt;}&lt;BR /&gt;)&lt;/P&gt;&lt;P&gt;You have to add an action to execute.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 15 Nov 2024 11:57:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/98916#M1694</guid>
      <dc:creator>David_Torrejon</dc:creator>
      <dc:date>2024-11-15T11:57:19Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/100025#M1714</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/88335"&gt;@Akshay_Petkar&lt;/a&gt;&amp;nbsp;, please refer this code ,&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;df &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"delta"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;load&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;f&lt;/SPAN&gt;&lt;SPAN&gt;"&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;SPAN&gt;bronze_folder_path&lt;/SPAN&gt;&lt;SPAN&gt;}&lt;/SPAN&gt;&lt;SPAN&gt;/Test_new"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Table Structure&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;%sql&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;CREATE&lt;/SPAN&gt; &lt;SPAN&gt;TABLE&lt;/SPAN&gt; &lt;SPAN&gt;IF&lt;/SPAN&gt; &lt;SPAN&gt;NOT&lt;/SPAN&gt; &lt;SPAN&gt;EXISTS&lt;/SPAN&gt; &lt;SPAN&gt;test_project_ws&lt;/SPAN&gt;&lt;SPAN&gt;.&lt;/SPAN&gt;&lt;SPAN&gt;demo&lt;/SPAN&gt;&lt;SPAN&gt;.Test_merge (&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;ID&lt;/SPAN&gt; &lt;SPAN&gt;INT&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;Name&lt;/SPAN&gt; &lt;SPAN&gt;STRING&lt;/SPAN&gt;&lt;SPAN&gt; ,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;Address &lt;/SPAN&gt;&lt;SPAN&gt;STRING&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;date&lt;/SPAN&gt; &lt;SPAN&gt;DATE&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;createdDate &lt;/SPAN&gt;&lt;SPAN&gt;TIMESTAMP&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;updatedDate &lt;/SPAN&gt;&lt;SPAN&gt;TIMESTAMP&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;USING&lt;/SPAN&gt; &lt;SPAN&gt;DELTA&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; pyspark.sql.functions &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; current_timestamp&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;from&lt;/SPAN&gt;&lt;SPAN&gt; delta.tables &lt;/SPAN&gt;&lt;SPAN&gt;import&lt;/SPAN&gt;&lt;SPAN&gt; DeltaTable&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;table_name &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;"test_project_ws.demo.Test_merge"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;deltaTable &lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt; DeltaTable.&lt;/SPAN&gt;&lt;SPAN&gt;forName&lt;/SPAN&gt;&lt;SPAN&gt;(spark, table_name)&lt;/SPAN&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;SPAN&gt;deltaTable.&lt;/SPAN&gt;&lt;SPAN&gt;alias&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"tgt"&lt;/SPAN&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;merge&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; bronze_df.&lt;/SPAN&gt;&lt;SPAN&gt;alias&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"upd"&lt;/SPAN&gt;&lt;SPAN&gt;),&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"tgt.Id = upd.Id"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;whenMatchedUpdate&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Id"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.Id"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Name"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.Name"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Address"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.Address"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"date"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.date"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"updatedDate"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"current_timestamp()"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; }&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;whenNotMatchedInsert&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;values&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;{&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Id"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.Id"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Name"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.Name"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"Address"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.Address"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"date"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"upd.date"&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;"createdDate"&lt;/SPAN&gt;&lt;SPAN&gt;: &lt;/SPAN&gt;&lt;SPAN&gt;"current_timestamp()"&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; }&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN&gt;).&lt;/SPAN&gt;&lt;SPAN&gt;execute&lt;/SPAN&gt;&lt;SPAN&gt;()&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 26 Nov 2024 06:16:13 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/100025#M1714</guid>
      <dc:creator>JissMathew</dc:creator>
      <dc:date>2024-11-26T06:16:13Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/100038#M1715</link>
      <description>&lt;P&gt;Is there any limitation to the length of the string passed to&amp;nbsp;&lt;SPAN&gt;md5 function when concatenating multiple columns to generate&amp;nbsp;hash_value field ?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Nov 2024 08:36:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/100038#M1715</guid>
      <dc:creator>dbarua</dc:creator>
      <dc:date>2024-11-26T08:36:38Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/100319#M1719</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/132169"&gt;@JissMathew&lt;/a&gt;&amp;nbsp; and&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/120641"&gt;@David_Torrejon&lt;/a&gt;&amp;nbsp;, Thanks for sharing the example&lt;/P&gt;</description>
      <pubDate>Thu, 28 Nov 2024 10:51:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/100319#M1719</guid>
      <dc:creator>bhanu_gautam</dc:creator>
      <dc:date>2024-11-28T10:51:22Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/104311#M1794</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/120641"&gt;@David_Torrejon&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Doesn't you example code perform SCD Type 1 rather than Type 2?&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;whenMatchedUpdate() updates an existing record.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;whenNotMatchedInsert() inserts new records.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;whenNotMatchedBySourceUpdate() updates records not available in the source&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;In SCD Type 2, when an old record is updated, a corresponding new row needs to be inserted with&amp;nbsp;is_current as 'true'. Where is this happening?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 06 Jan 2025 11:21:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/104311#M1794</guid>
      <dc:creator>Prabhuram</dc:creator>
      <dc:date>2025-01-06T11:21:46Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/104480#M1798</link>
      <description>&lt;P&gt;Yep, I was thinking the same. The only way I know is to have a seperated INSERT INTO command before the MERGE INTO.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;INSERT INTO target_table (
     columns,
     effectiveStartDate,
     effectiveEndDate,
     isCurrent,
     version
)
SELECT
     new.columns,
     DATE(new.timestamp),
     DATE('9999-12-31'),
     TRUE,
     target.version + 1
FROM df as new
LEFT JOIN destination_table as target
ON new.customerId = target.customerId and target.isCurrent
WHERE (
  target.column &amp;lt;&amp;gt; new.column
  OR target.column &amp;lt;&amp;gt; new.column
)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 10:16:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/104480#M1798</guid>
      <dc:creator>svrijssel</dc:creator>
      <dc:date>2025-01-07T10:16:25Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/136299#M2300</link>
      <description>&lt;P&gt;Same Concern where is this happening do we have any other example where its handling it correctly by maintaining history&lt;/P&gt;</description>
      <pubDate>Tue, 28 Oct 2025 02:35:25 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/136299#M2300</guid>
      <dc:creator>jbhavesh</dc:creator>
      <dc:date>2025-10-28T02:35:25Z</dc:date>
    </item>
    <item>
      <title>Re: Need a Sample MERGE INTO Query for SCD Type 2 Implementation</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/136667#M2304</link>
      <description>&lt;DIV&gt;&lt;SPAN&gt;Here is a simple example using an upstream Delta table with ChangeDataFeed enabled, using table_changes() to get the records with their corresponding operation, this is a 2 step process&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;SPAN&gt;you need to close out modified or deleted records&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;add new rows (inserted at the source)&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/OL&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;PRE&gt;&lt;SPAN&gt;-- Step 1: Close out records that changed (updates and deletes)&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;MERGE&lt;/SPAN&gt; &lt;SPAN&gt;INTO&lt;/SPAN&gt; &lt;SPAN&gt;west_division&lt;/SPAN&gt;.&lt;SPAN&gt;retail_data&lt;/SPAN&gt;.&lt;WBR /&gt;customers_type2 &lt;SPAN&gt;AS&lt;/SPAN&gt; &lt;SPAN&gt;target&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;USING&lt;/SPAN&gt; (&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;SELECT DISTINCT&lt;/SPAN&gt; customer_id, _commit_timestamp&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;FROM&lt;/SPAN&gt; table_changes(&lt;SPAN&gt;'east_division_&lt;WBR /&gt;shared.retail.customers'&lt;/SPAN&gt;, &lt;SPAN&gt;2&lt;/SPAN&gt;, &lt;SPAN&gt;5&lt;/SPAN&gt;)&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;WHERE&lt;/SPAN&gt; _change_type &lt;SPAN&gt;IN&lt;/SPAN&gt; (&lt;SPAN&gt;'update_postimage'&lt;/SPAN&gt;, &lt;SPAN&gt;'delete'&lt;/SPAN&gt;)&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;ORDER BY&lt;/SPAN&gt; _commit_timestamp&lt;BR /&gt;) &lt;SPAN&gt;AS&lt;/SPAN&gt; &lt;SPAN&gt;source&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;ON&lt;/SPAN&gt; &lt;SPAN&gt;target&lt;/SPAN&gt;.&lt;SPAN&gt;customer_id&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;source&lt;/SPAN&gt;.&lt;SPAN&gt;customer_id&lt;/SPAN&gt; &lt;SPAN&gt;AND&lt;/SPAN&gt; &lt;SPAN&gt;target&lt;/SPAN&gt;.&lt;SPAN&gt;is_current&lt;/SPAN&gt; &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;true&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;WHEN&lt;/SPAN&gt; &lt;SPAN&gt;MATCHED&lt;/SPAN&gt; &lt;SPAN&gt;THEN&lt;/SPAN&gt;&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;UPDATE&lt;/SPAN&gt; &lt;SPAN&gt;SET&lt;/SPAN&gt;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; end_date &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;source&lt;/SPAN&gt;.&lt;SPAN&gt;_commit_timestamp&lt;/SPAN&gt;,&lt;BR /&gt;&amp;nbsp; &amp;nbsp; is_current &lt;SPAN&gt;=&lt;/SPAN&gt; &lt;SPAN&gt;false&lt;/SPAN&gt;;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;-- Step 2: Insert new versions (inserts and updates)&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;INSERT INTO&lt;/SPAN&gt; &lt;SPAN&gt;west_division&lt;/SPAN&gt;.&lt;SPAN&gt;retail_data&lt;/SPAN&gt;.&lt;WBR /&gt;customers_type2&lt;BR /&gt;&lt;SPAN&gt;SELECT&lt;/SPAN&gt;&lt;BR /&gt;&amp;nbsp; customer_id, customer_name, email, country, signup_date, customer_segment,&lt;BR /&gt;&amp;nbsp; _commit_timestamp &lt;SPAN&gt;as&lt;/SPAN&gt; start_date,&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;NULL&lt;/SPAN&gt; &lt;SPAN&gt;as&lt;/SPAN&gt; end_date,&lt;BR /&gt;&amp;nbsp; &lt;SPAN&gt;true&lt;/SPAN&gt; &lt;SPAN&gt;as&lt;/SPAN&gt; is_current&lt;BR /&gt;&lt;SPAN&gt;FROM&lt;/SPAN&gt; table_changes(&lt;SPAN&gt;'east_division_&lt;WBR /&gt;shared.retail.customers'&lt;/SPAN&gt;, &lt;SPAN&gt;2&lt;/SPAN&gt;, &lt;SPAN&gt;5&lt;/SPAN&gt;)&lt;BR /&gt;&lt;SPAN&gt;WHERE&lt;/SPAN&gt; _change_type &lt;SPAN&gt;IN&lt;/SPAN&gt; (&lt;SPAN&gt;'insert'&lt;/SPAN&gt;, &lt;SPAN&gt;'update_postimage'&lt;/SPAN&gt;)&lt;BR /&gt;&lt;SPAN&gt;ORDER BY&lt;/SPAN&gt; _commit_timestamp;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Oct 2025 03:07:31 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/need-a-sample-merge-into-query-for-scd-type-2-implementation/m-p/136667#M2304</guid>
      <dc:creator>jeffreyaven</dc:creator>
      <dc:date>2025-10-30T03:07:31Z</dc:date>
    </item>
  </channel>
</rss>

