<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delta Log checkpoints not being created? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37069#M26255</link>
    <description>&lt;P&gt;As of a recent update, checkpoints for Delta tables are created every 100 commits by default. This change was made as an improvement.&lt;/P&gt;&lt;P&gt;If you want a checkpoint file for a Delta table every 10 commits, or at any other interval, you can customize it with the following table property:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;"delta.checkpointInterval"&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Syntax:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ALTER TABLE &amp;lt;table_name&amp;gt; SET TBLPROPERTIES ("delta.checkpointInterval" = "10")&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Replace "10" with the checkpoint interval you want. The property has to be set on the table itself by altering its table properties, as in the statement above.&lt;/P&gt;</description>
    <pubDate>Thu, 06 Jul 2023 09:21:17 GMT</pubDate>
    <dc:creator>Vinay_M_R</dc:creator>
    <dc:date>2023-07-06T09:21:17Z</dc:date>
    <item>
      <title>Delta Log checkpoints not being created?</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37042#M26247</link>
      <description>&lt;P&gt;The &lt;A href="https://github.com/delta-io/delta/blob/master/PROTOCOL.md#checkpoints" target="_self"&gt;Delta protocol&lt;/A&gt; mentions that checkpoints for Delta tables are created every 10 commits. However, after I modify a table with more than 10 separate operations (producing more than 10 separate JSON files in the _delta_log directory), no checkpoint files are created. &lt;STRONG&gt;Are there specific conditions under which checkpoint files are created (and not just every 10 commits), e.g. certain operations, data size, etc.?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;My concern is that if checkpoints aren't created, the Delta logs aren't cleaned up. If that happens, will the metadata for my tables grow without bound over time?&lt;/P&gt;&lt;P&gt;I created the Delta tables by executing the following (the storage location is S3):&lt;/P&gt;&lt;PRE&gt;df.write.format("delta").saveAsTable(name="&amp;lt;table&amp;gt;", path="&amp;lt;s3_path&amp;gt;", mode="overwrite", overwriteSchema=True)&lt;/PRE&gt;</description>
      <pubDate>Thu, 06 Jul 2023 01:41:07 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37042#M26247</guid>
      <dc:creator>442027</dc:creator>
      <dc:date>2023-07-06T01:41:07Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Log checkpoints not being created?</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37069#M26255</link>
      <description>&lt;P&gt;As of a recent update, checkpoints for Delta tables are created every 100 commits by default. This change was made as an improvement.&lt;/P&gt;&lt;P&gt;If you want a checkpoint file for a Delta table every 10 commits, or at any other interval, you can customize it with the following table property:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;"delta.checkpointInterval"&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Syntax:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;ALTER TABLE &amp;lt;table_name&amp;gt; SET TBLPROPERTIES ("delta.checkpointInterval" = "10")&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Replace "10" with the checkpoint interval you want. The property has to be set on the table itself by altering its table properties, as in the statement above.&lt;/P&gt;</description>
      <pubDate>Thu, 06 Jul 2023 09:21:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37069#M26255</guid>
      <dc:creator>Vinay_M_R</dc:creator>
      <dc:date>2023-07-06T09:21:17Z</dc:date>
    </item>
    <item>
      <title>Re: Delta Log checkpoints not being created?</title>
      <link>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37111#M26258</link>
      <description>&lt;P&gt;Tested and confirmed that it's every 100 commits by default. Thanks, that makes a lot of sense!&lt;/P&gt;</description>
      <pubDate>Thu, 06 Jul 2023 19:04:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/delta-log-checkpoints-not-being-created/m-p/37111#M26258</guid>
      <dc:creator>442027</dc:creator>
      <dc:date>2023-07-06T19:04:57Z</dc:date>
    </item>
  </channel>
</rss>

