<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Don't want checkpoint in delta in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21818#M14908</link>
    <description>&lt;P&gt;Checkpoint creation in Delta cannot be disabled; there is no user-facing option to turn it off. Although it is possible to delay checkpoint file creation, doing so could hurt the performance of the Delta table. By default, a checkpoint file is created every 10 commits on the Delta table.&lt;/P&gt;</description>
    <pubDate>Tue, 22 Jun 2021 22:57:23 GMT</pubDate>
    <dc:creator>brickster_2018</dc:creator>
    <dc:date>2021-06-22T22:57:23Z</dc:date>
    <item>
      <title>Don't want checkpoint in delta</title>
      <link>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21817#M14907</link>
      <description>&lt;P&gt;Suppose I am not interested in checkpoints. How can I disable checkpoint writes in Delta?&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jun 2021 11:53:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21817#M14907</guid>
      <dc:creator>User16826994223</dc:creator>
      <dc:date>2021-06-22T11:53:45Z</dc:date>
    </item>
    <item>
      <title>Re: Don't want checkpoint in delta</title>
      <link>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21818#M14908</link>
      <description>&lt;P&gt;Checkpoint creation in Delta cannot be disabled; there is no user-facing option to turn it off. Although it is possible to delay checkpoint file creation, doing so could hurt the performance of the Delta table. By default, a checkpoint file is created every 10 commits on the Delta table.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jun 2021 22:57:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21818#M14908</guid>
      <dc:creator>brickster_2018</dc:creator>
      <dc:date>2021-06-22T22:57:23Z</dc:date>
    </item>
    <item>
      <title>Re: Don't want checkpoint in delta</title>
      <link>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21819#M14909</link>
      <description>&lt;P&gt;Writing statistics in a checkpoint has a cost, which is usually visible only for very large tables. However, these statistics are very useful for data skipping, which speeds up subsequent operations.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;In Databricks Runtime 7.2 and below, column-level statistics are stored in Delta Lake checkpoints as a JSON column. In Databricks Runtime 7.3 LTS and above, column-level statistics are stored as a struct (the struct format makes Delta Lake reads much faster).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Two table properties control column-level statistics in checkpoints:&lt;/P&gt;&lt;P&gt;delta.checkpoint.writeStatsAsJson and delta.checkpoint.writeStatsAsStruct. If both table properties are false, no statistics are collected or written, and readers won't be able to perform data skipping.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;For more details on the tradeoffs between statistics and checkpoints, see &lt;A href="https://docs.databricks.com/delta/optimizations/file-mgmt.html#id2" alt="https://docs.databricks.com/delta/optimizations/file-mgmt.html#id2" target="_blank"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 23 Jun 2021 00:13:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/don-t-want-checkpoint-in-delta/m-p/21819#M14909</guid>
      <dc:creator>sajith_appukutt</dc:creator>
      <dc:date>2021-06-23T00:13:57Z</dc:date>
    </item>
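    <!--
    A minimal sketch of how the settings discussed in the replies above might be
    applied. It assumes the checkpoint interval is exposed as the Delta table
    property delta.checkpointInterval (the thread mentions delaying checkpoints but
    does not name the property). The helper checkpoint_tuning_sql and the table
    name "events" are hypothetical and used only for illustration. The helper only
    builds a SQL string; it does not connect to a cluster.

    ```python
    from typing import Optional


    def checkpoint_tuning_sql(
        table: str,
        interval: Optional[int] = None,
        stats_as_json: Optional[bool] = None,
        stats_as_struct: Optional[bool] = None,
    ) -> str:
        """Build an ALTER TABLE statement that sets checkpoint-related properties.

        Only the properties passed explicitly are included in the statement.
        """
        props = []

        if interval is not None:
            if interval < 1:
                raise ValueError("interval must be a positive number of commits")
            # Assumed property name; the thread's stated default is 10 commits.
            props.append(f"'delta.checkpointInterval' = '{interval}'")

        if stats_as_json is not None:
            props.append(
                f"'delta.checkpoint.writeStatsAsJson' = '{str(stats_as_json).lower()}'"
            )

        if stats_as_struct is not None:
            props.append(
                f"'delta.checkpoint.writeStatsAsStruct' = '{str(stats_as_struct).lower()}'"
            )

        if not props:
            raise ValueError("at least one property must be specified")

        return f"ALTER TABLE {table} SET TBLPROPERTIES ({', '.join(props)})"


    if __name__ == "__main__":
        # Delay checkpoints to every 100 commits on a hypothetical table.
        print(checkpoint_tuning_sql("events", interval=100))
        # Keep struct statistics but skip the JSON copy.
        print(checkpoint_tuning_sql("events", stats_as_json=False, stats_as_struct=True))
    ```

    In a notebook, the returned string could be passed to spark.sql(...). As the
    replies note, a longer interval only delays checkpoints and may cost read
    performance, and setting both statistics flags to false gives up data skipping.
    -->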
  </channel>
</rss>

