<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Auto loader from tables in Delta Share in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/117306#M45475</link>
    <description>&lt;P&gt;Checking.&lt;/P&gt;</description>
    <pubDate>Thu, 01 May 2025 06:59:44 GMT</pubDate>
    <dc:creator>NandiniN</dc:creator>
    <dc:date>2025-05-01T06:59:44Z</dc:date>
    <item>
      <title>Auto loader from tables in Delta Share</title>
      <link>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/109332#M43277</link>
      <description>&lt;P&gt;Hello,&lt;BR /&gt;&lt;BR /&gt;I am trying to read Delta tables from Delta Shares shared from other environments.&lt;BR /&gt;&lt;BR /&gt;The pipeline runs fine at first. However, once the Delta table is updated in the source (a Delta Share in GCP), the code below fails with the error shown unless I reset the checkpoint. When reading Delta tables from a Delta Share, can I keep the checkpoint so that the same data is not written twice if the pipeline is executed twice?&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; streaming_transactions = spark.readStream.format("delta") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("cloudFiles.format", "deltaSharing") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .table(f"{source_root_path}.{table_name}") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .selectExpr("*", *metadata) &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/EM&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; # 'mergeSchema' option enables schema evolution when writing&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; # 'readChangeFeed' option tells Delta Lake to read the change data from the Delta table, rather than the full data.&lt;/EM&gt;&lt;/DIV&gt;&lt;BR /&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; streaming_transactions.writeStream.format("delta") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .partitionBy(f"retrieved_datetime") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .trigger(availableNow=True) 
\&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("checkpointLocation", checkpoint) \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("readChangeFeed", "true") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .option("mergeSchema", "true") \&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; .toTable(&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; tableName=target_table_name,&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; format="delta",&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; outputMode="append",&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; path=target_path&lt;/EM&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; )&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;com.databricks.sql.transaction.tahoe.DeltaUnsupportedOperationException: [DELTA_SOURCE_TABLE_IGNORE_CHANGES] Detected a data update (for example CREATE OR REPLACE TABLE AS SELECT (Map(partitionBy -&amp;gt; [], clusterBy -&amp;gt; [], description -&amp;gt; null, isManaged -&amp;gt; true, properties -&amp;gt; {"delta.enableDeletionVectors":"true"}, statsOnLoad -&amp;gt; false))) in the source table at version 8. This is currently not supported. If this is going to happen regularly and you are okay to skip changes, set the option 'skipChangeCommits' to 'true'. 
If you would like the data update to be reflected, please restart this query with a fresh checkpoint directory or do a full refresh if you are using DLT. If you need to handle these changes, please switch to MVs. The source table can be found at path gs://databricks.....&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Fri, 07 Feb 2025 01:24:50 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/109332#M43277</guid>
      <dc:creator>dbuenosilva</dc:creator>
      <dc:date>2025-02-07T01:24:50Z</dc:date>
    </item>
    <item>
      <title>Re: Auto loader from tables in Delta Share</title>
      <link>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/117306#M45475</link>
      <description>&lt;P&gt;Checking.&lt;/P&gt;</description>
      <pubDate>Thu, 01 May 2025 06:59:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/117306#M45475</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2025-05-01T06:59:44Z</dc:date>
    </item>
    <item>
      <title>Re: Auto loader from tables in Delta Share</title>
      <link>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/117367#M45485</link>
      <description>&lt;P class="_1t7bu9h1 paragraph"&gt;The error you are encountering, &lt;CODE&gt;DeltaUnsupportedOperationException: [DELTA_SOURCE_TABLE_IGNORE_CHANGES]&lt;/CODE&gt;, occurs because your streaming job detected a data update in the source Delta table, which is not supported for this type of source. A streaming read from a Delta table expects the source to be append-only.&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;This error is triggered when the source table undergoes data updates (such as &lt;CODE&gt;CREATE OR REPLACE&lt;/CODE&gt; or &lt;CODE&gt;UPDATE&lt;/CODE&gt;), and the streaming process doesn't know how to handle them.&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;As the error message says, this operation is not supported. If you do not want the stream to be affected by these commits, you can skip them by setting &lt;CODE&gt;.option("skipChangeCommits", "true")&lt;/CODE&gt; on the streaming read. Note: with &lt;CODE&gt;skipChangeCommits&lt;/CODE&gt; enabled, you might miss changes made to existing records in the source table. Downstream systems should be designed to handle such cases if necessary.&lt;/P&gt;
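&lt;P class="_1t7bu9h1 paragraph"&gt;A minimal sketch of where that option goes, assuming the shared table can be read by name through &lt;CODE&gt;spark.readStream&lt;/CODE&gt; and that your runtime honors &lt;CODE&gt;skipChangeCommits&lt;/CODE&gt; for it. The helper name and the table name in the usage comment are hypothetical; only the option itself comes from the error message.&lt;/P&gt;

```python
def read_source_skipping_change_commits(spark, source_table):
    """Build a streaming read that skips commits which update or delete existing rows.

    `spark` is an existing SparkSession (assumed to be provided by the notebook or job).
    `source_table` is the fully qualified name of the shared table (hypothetical here).
    """
    return (
        spark.readStream
        # Reader option named in the error message: commits that change
        # existing data are skipped instead of failing the stream.
        .option("skipChangeCommits", "true")
        .table(source_table)
    )


# Hypothetical usage in a notebook where `spark` already exists:
# streaming_transactions = read_source_skipping_change_commits(
#     spark, "my_share_catalog.my_schema.transactions"
# )
```

&lt;P class="_1t7bu9h1 paragraph"&gt;This is a reader option, so it is set on &lt;CODE&gt;readStream&lt;/CODE&gt; rather than on &lt;CODE&gt;writeStream&lt;/CODE&gt;.&lt;/P&gt;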
&lt;P class="_1t7bu9h1 paragraph"&gt;Another suggestion: if data updates in the source table happen regularly and need to be propagated downstream, consider switching to Materialized Views, as the error message recommends. That way the updates are handled rather than rejected, and downstream systems can process the changes.&lt;/P&gt;
&lt;P class="_1t7bu9h1 paragraph"&gt;You can run &lt;CODE&gt;DESCRIBE HISTORY delta.`&amp;lt;table_path&amp;gt;`&lt;/CODE&gt; to inspect the operation recorded at version 8 of the source table and understand what changed.&lt;/P&gt;
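&lt;P class="_1t7bu9h1 paragraph"&gt;A rough sketch of that check from PySpark, assuming you have access to the source table's history (for a shared table this may need to be run on the provider side). The helper name and the table reference in the usage comment are hypothetical.&lt;/P&gt;

```python
def describe_source_version(spark, table_ref, version):
    """Return the history row for one version of a Delta table.

    `spark` is an existing SparkSession (assumed to be provided by the notebook or job).
    `table_ref` is a table name or a delta.`path` reference (hypothetical here).
    """
    history = spark.sql(f"DESCRIBE HISTORY {table_ref}")
    # Keep only the requested version and the columns that describe the commit.
    return history.filter(f"version = {int(version)}").select(
        "version", "timestamp", "operation", "operationParameters"
    )


# Hypothetical usage in a notebook where `spark` already exists:
# describe_source_version(spark, "my_catalog.my_schema.transactions", 8).show(truncate=False)
```

&lt;P class="_1t7bu9h1 paragraph"&gt;The &lt;CODE&gt;operation&lt;/CODE&gt; column shows which kind of commit the stream rejected.&lt;/P&gt;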
</description>
      <pubDate>Thu, 01 May 2025 11:55:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/auto-loader-from-tables-in-delta-share/m-p/117367#M45485</guid>
      <dc:creator>NandiniN</dc:creator>
      <dc:date>2025-05-01T11:55:18Z</dc:date>
    </item>
  </channel>
</rss>

