<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MetadataChangedException in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16833#M10935</link>
    <description>&lt;P&gt;Thanks @Hubert Dudek​&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 23 Jun 2022 19:01:22 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2022-06-23T19:01:22Z</dc:date>
    <item>
      <title>MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16827#M10929</link>
      <description>&lt;P&gt;A Delta Lake table was created with an identity column, and I'm not able to load data into it in parallel from four processes; I'm getting a MetadataChangedException.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I don't want to load the data into a temp table first. I need to load it directly, and in parallel, into the Delta table.&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2022 17:38:14 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16827#M10929</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-06-23T17:38:14Z</dc:date>
    </item>
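A common mitigation for this conflict is to retry the write with backoff whenever the concurrent-metadata error is raised. A minimal sketch in plain Python, where `write_batch` and `MetadataChangedError` are hypothetical stand-ins for the real Spark write call and `delta.exceptions.MetadataChangedException`:

```python
import random
import time

class MetadataChangedError(Exception):
    """Stand-in for Delta's MetadataChangedException."""

def write_with_retry(write_batch, max_attempts=5, base_delay=1.0):
    """Call write_batch(), retrying with exponential backoff plus jitter
    when a concurrent metadata conflict is reported."""
    for attempt in range(1, max_attempts + 1):
        try:
            return write_batch()
        except MetadataChangedError:
            if attempt == max_attempts:
                raise
            # Back off so the parallel writers stop colliding on the same commit.
            time.sleep(base_delay * (2 ** (attempt - 1) + random.random()))
```

Note that retrying only reduces the collision rate between the four writers; it does not remove the underlying serialization of identity-column updates.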
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16829#M10931</link>
      <description>&lt;P&gt;MetadataChangedException: The metadata of the Delta table has been changed by a concurrent update. Please try the operation again&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2022 17:55:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16829#M10931</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-06-23T17:55:04Z</dc:date>
    </item>
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16831#M10933</link>
      <description>&lt;P&gt;No ALTER TABLE operations are carried out. Simply loading data from four concurrently running notebooks into the same Delta Lake table, which has ID as an identity column, triggers the issue.&lt;/P&gt;&lt;P&gt;Loading the data into a temp table first and then inserting it into the target table with the identity column causes no problems.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;But for some reason I need to load the data in parallel, directly into the table with the identity column.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2022 18:21:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16831#M10933</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-06-23T18:21:29Z</dc:date>
    </item>
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16832#M10934</link>
      <description>&lt;P&gt;@Gokul K&amp;nbsp;, the identity state is stored in the table schema (which is an awful design). That's why concurrent inserts are not supported.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I even recorded a video about that problem: &lt;A href="https://www.youtube.com/watch?v=BcYY_aQD0tQ" alt="https://www.youtube.com/watch?v=BcYY_aQD0tQ" target="_blank"&gt;Delta Identity Column with Databricks 10.4 - crash test - YouTube&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2022 18:43:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16832#M10934</guid>
      <dc:creator>Hubert-Dudek</dc:creator>
      <dc:date>2022-06-23T18:43:57Z</dc:date>
    </item>
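To illustrate the point above: per the Delta protocol, an identity column's state (start, step, and high-water mark) lives under `delta.identity.*` keys inside the column metadata of the table's schema string, so every batch that advances the counter rewrites the table metadata. A simplified, hypothetical sketch (the field values are made up):

```python
import json

# Illustrative only: a cut-down Delta schemaString showing where identity
# state is kept. The delta.identity.* key names follow the Delta protocol;
# the surrounding values are invented for the example.
schema_string = json.dumps({
    "type": "struct",
    "fields": [{
        "name": "id",
        "type": "long",
        "nullable": False,
        "metadata": {
            "delta.identity.start": 1,
            "delta.identity.step": 1,
            # Advancing this high-water mark changes the table metadata,
            # which is why two concurrent inserts conflict.
            "delta.identity.highWaterMark": 41,
        },
    }],
})

meta = json.loads(schema_string)["fields"][0]["metadata"]
next_id = meta["delta.identity.highWaterMark"] + meta["delta.identity.step"]
```

Because the counter is part of the schema rather than a separate, atomically updatable object, two writers that both bump the high-water mark produce conflicting metadata commits.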
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16833#M10935</link>
      <description>&lt;P&gt;Thanks @Hubert Dudek​&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jun 2022 19:01:22 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16833#M10935</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2022-06-23T19:01:22Z</dc:date>
    </item>
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16835#M10937</link>
      <description>&lt;P&gt;@Hubert Dudek​&amp;nbsp;@Kaniz Fatma​&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am experiencing the same issue. Now that I understand the reason behind it, I would appreciate your assistance in finding a solution for generating a sequence for the table. Multiple concurrent jobs will be performing insertions and updates on the same table. To address the concurrent update issue, I have partitioned the table. However, I am struggling to determine the best approach for generating the Id values. I would greatly appreciate any suggestions you can provide.&lt;/P&gt;</description>
      <pubDate>Tue, 13 Jun 2023 07:45:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/16835#M10937</guid>
      <dc:creator>Databricks3</dc:creator>
      <dc:date>2023-06-13T07:45:54Z</dc:date>
    </item>
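One workaround people use instead of a GENERATED identity column is to compute ids as the current maximum plus a 1-based row number within the batch (in Spark, a `row_number()` window over the incoming frame). A toy Python sketch of the arithmetic, with plain lists standing in for DataFrames:

```python
# Sketch of the max-plus-row_number workaround: instead of relying on a
# GENERATED identity column, each batch assigns ids starting just above
# the table's current maximum id.

def assign_ids(existing_ids, new_rows):
    """Give each new row an id of max(existing_ids) + its 1-based position
    in the batch. An empty table starts the sequence at 1."""
    offset = max(existing_ids, default=0)
    return [{"id": offset + i, **row} for i, row in enumerate(new_rows, start=1)]
```

Note this only helps if writers do not read the same maximum concurrently; truly parallel jobs still need to reserve disjoint id ranges up front or serialize the read-max step.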
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/50675#M28861</link>
      <description>&lt;P&gt;Even with a retry approach or a try/except approach, there is no guarantee that another parallel process's load has completed, especially for large-volume tables. So in such cases, even if you repeat the write inside the exception handler, it can still fail. What is the best possible solution for this? Is there any other way to generate an auto-incrementing ID column without using the GENERATED clause in the DDL?&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Nov 2023 17:03:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/50675#M28861</guid>
      <dc:creator>Anonymous47</dc:creator>
      <dc:date>2023-11-08T17:03:59Z</dc:date>
    </item>
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/82113#M36522</link>
      <description>&lt;P&gt;I recently ran into this&amp;nbsp;&lt;SPAN&gt;MetadataChangedException. After watching the video @Hubert Dudek&amp;nbsp;posted, it's pretty clear what is going on: it was built by object-storage people, not by people who think like relational database engine builders. That's to be expected. Databricks is wonderful in many ways, but in SQL and&amp;nbsp;relational engine features like sequences, it is evolving slowly.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;I switched to a serial write to get around the problem because of a deadline, but we really should open a ticket with Databricks to get some clarity on an issue (parallel sequence updates) that relational databases solved 50+ years ago. As the video says, it's a bad idea to store the identity information in the schema. It needs to be something like a separate file with a thread-safe approach to updates.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 07 Aug 2024 04:09:21 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/82113#M36522</guid>
      <dc:creator>seans</dc:creator>
      <dc:date>2024-08-07T04:09:21Z</dc:date>
    </item>
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/101144#M40559</link>
      <description>&lt;P&gt;I'm having the same issue: I need to load a large amount of data from separate files into a Delta table, and I want to do it with a for-each loop so I don't have to run it sequentially, which would take days. There should be a way to handle this&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":upside_down_face:"&gt;🙃&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Dec 2024 22:43:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/101144#M40559</guid>
      <dc:creator>cpc0707</dc:creator>
      <dc:date>2024-12-05T22:43:30Z</dc:date>
    </item>
    <item>
      <title>Re: MetadataChangedException</title>
      <link>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/142897#M52044</link>
      <description>&lt;P&gt;I'm also having the same problem. I'm using Auto Loader to load many files into a Delta table with an identity column. What used to work now dies with this error -- after running for a long time!&lt;/P&gt;</description>
      <pubDate>Sat, 03 Jan 2026 12:34:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/metadatachangedexception/m-p/142897#M52044</guid>
      <dc:creator>lprevost</dc:creator>
      <dc:date>2026-01-03T12:34:23Z</dc:date>
    </item>
  </channel>
</rss>

