<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delta Table Optimize Error in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5732#M2068</link>
    <description>&lt;P&gt;@Dean Lovelace&amp;nbsp;:&lt;/P&gt;&lt;P&gt;The error message indicates that a record-count check failed: the number of records changed after the optimize() command ran. The optimize() command improves the performance of Delta tables by compacting small files into larger ones, which can speed up queries. Because compaction only rewrites existing data, the record count is expected to stay the same. One possible cause is a concurrent write to the table while optimize() is running.&lt;/P&gt;&lt;P&gt;To resolve this issue, you may want to consider the following steps:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Check for concurrent write operations: Check whether any other processes or jobs write to the Delta table while the optimize() command is running. If so, you may need to pause them temporarily to avoid conflicts.&lt;/LI&gt;&lt;LI&gt;Retry the optimize() command: If you are sure there are no concurrent write operations, run the optimize() command again to see whether the issue persists. The error may be transient and clear on retry.&lt;/LI&gt;&lt;LI&gt;Use a different optimize() configuration: The optimize() call itself takes no file-size parameters; the target file size is controlled through Spark configuration such as spark.databricks.delta.optimize.maxFileSize. Changing it alters how files are grouped for compaction, although it is not certain to resolve this error.&lt;/LI&gt;&lt;LI&gt;Perform a full compaction: If the issue persists, you can rewrite the table's files manually instead of calling optimize(), for example by reading the table, repartitioning it into the desired number of files, and overwriting it with the dataChange option set to false. This can be more resource-intensive and may take longer to complete.&lt;/LI&gt;&lt;/OL&gt;</description>
    <pubDate>Tue, 18 Apr 2023 09:06:28 GMT</pubDate>
    <dc:creator>Anonymous</dc:creator>
    <dc:date>2023-04-18T09:06:28Z</dc:date>
    <item>
      <title>Delta Table Optimize Error</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5730#M2066</link>
      <description>&lt;P&gt;I have started getting an error message when running the following optimize command:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;deltaTable.optimize().executeCompaction()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Error:&lt;/P&gt;&lt;P&gt;&lt;I&gt;java.util.concurrent.ExecutionException: java.lang.IllegalStateException: Number of records changed after Optimize. NumRecordsCheckInfo(OPTIMIZE,394,1058,2554337689,2600474509,0,0,Map(predicate -&amp;gt; "[]", zOrderBy -&amp;gt; "[]", batchId -&amp;gt; "0", auto -&amp;gt; false))&lt;/I&gt;&lt;/P&gt;&lt;P&gt;What is the cause of this? It has been running fine for months.&lt;/P&gt;&lt;P&gt;This is on Databricks Runtime 11.3 using PySpark.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Apr 2023 07:55:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5730#M2066</guid>
      <dc:creator>Dean_Lovelace</dc:creator>
      <dc:date>2023-04-17T07:55:08Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Table Optimize Error</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5732#M2068</link>
      <description>&lt;P&gt;@Dean Lovelace&amp;nbsp;:&lt;/P&gt;&lt;P&gt;The error message indicates that a record-count check failed: the number of records changed after the optimize() command ran. The optimize() command improves the performance of Delta tables by compacting small files into larger ones, which can speed up queries. Because compaction only rewrites existing data, the record count is expected to stay the same. One possible cause is a concurrent write to the table while optimize() is running.&lt;/P&gt;&lt;P&gt;To resolve this issue, you may want to consider the following steps:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;Check for concurrent write operations: Check whether any other processes or jobs write to the Delta table while the optimize() command is running. If so, you may need to pause them temporarily to avoid conflicts.&lt;/LI&gt;&lt;LI&gt;Retry the optimize() command: If you are sure there are no concurrent write operations, run the optimize() command again to see whether the issue persists. The error may be transient and clear on retry.&lt;/LI&gt;&lt;LI&gt;Use a different optimize() configuration: The optimize() call itself takes no file-size parameters; the target file size is controlled through Spark configuration such as spark.databricks.delta.optimize.maxFileSize. Changing it alters how files are grouped for compaction, although it is not certain to resolve this error.&lt;/LI&gt;&lt;LI&gt;Perform a full compaction: If the issue persists, you can rewrite the table's files manually instead of calling optimize(), for example by reading the table, repartitioning it into the desired number of files, and overwriting it with the dataChange option set to false. This can be more resource-intensive and may take longer to complete.&lt;/LI&gt;&lt;/OL&gt;</description>
      <pubDate>Tue, 18 Apr 2023 09:06:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5732#M2068</guid>
      <dc:creator>Anonymous</dc:creator>
      <dc:date>2023-04-18T09:06:28Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Table Optimize Error</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5733#M2069</link>
      <description>&lt;P&gt;How can I perform a full compaction?&lt;/P&gt;</description>
      <pubDate>Wed, 19 Apr 2023 14:08:04 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5733#M2069</guid>
      <dc:creator>Dean_Lovelace</dc:creator>
      <dc:date>2023-04-19T14:08:04Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Table Optimize Error</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5731#M2067</link>
      <description>&lt;P&gt;Hi, it looks like the command itself needs to be changed. See:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.delta.io/latest/delta-utility.html" alt="https://docs.delta.io/latest/delta-utility.html" target="_blank"&gt;https://docs.delta.io/latest/delta-utility.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Also, could you please recheck if &lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/349iD01F25630B70CF0B/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;Reference: &lt;A href="https://docs.databricks.com/sql/language-manual/delta-optimize.html" alt="https://docs.databricks.com/sql/language-manual/delta-optimize.html" target="_blank"&gt;https://docs.databricks.com/sql/language-manual/delta-optimize.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Also, please tag&amp;nbsp;&lt;A href="https://community.databricks.com/s/profile/0053f000000WWwvAAG" alt="https://community.databricks.com/s/profile/0053f000000WWwvAAG" target="_blank"&gt;@Debayan&lt;/A&gt;&amp;nbsp;in your next response so that I am notified. Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 18 Apr 2023 06:48:46 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-table-optimize-error/m-p/5731#M2067</guid>
      <dc:creator>Debayan</dc:creator>
      <dc:date>2023-04-18T06:48:46Z</dc:date>
    </item>
  </channel>
</rss>

