
Don't want checkpoint in delta

User16826994223
Honored Contributor III

Suppose I am not interested in checkpoints. How can I disable checkpoint writes in Delta?

2 REPLIES

User16869510359
Esteemed Contributor

Checkpoint creation in Delta is not a user-controllable feature, so it cannot be switched off. It is possible to delay checkpoint file creation, but this could hurt the performance of the Delta table. By default, a checkpoint file is created every 10 commits on the Delta table.
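As a rough illustration of delaying checkpoints: to my knowledge the interval is governed by the `delta.checkpointInterval` table property, which is not named in the reply above, so treat that as an assumption to verify against your runtime's documentation. The sketch below only builds the SQL statement; the table name `events` and the helper name are hypothetical.

```python
def checkpoint_interval_sql(table_name: str, commits: int) -> str:
    """Build an ALTER TABLE statement that sets the checkpoint interval.

    Assumes the table property is `delta.checkpointInterval` (default 10).
    """
    if commits < 1:
        raise ValueError("commits must be a positive integer")
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('delta.checkpointInterval' = '{commits}')"
    )


# Hypothetical table name; checkpoint every 100 commits instead of every 10.
stmt = checkpoint_interval_sql("events", 100)
print(stmt)

# In a notebook with an active SparkSession named `spark`, you would run:
# spark.sql(stmt)
```

A larger interval means fewer checkpoint writes, but readers then have to replay more JSON commit files to reconstruct the table state, which is the performance cost mentioned above.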

sajith_appukutt
Honored Contributor II

Writing statistics in a checkpoint has a cost, which is usually visible only for very large tables. However, it is worth mentioning that these statistics are very useful for data skipping, which speeds up subsequent operations.

In Databricks Runtime 7.2 and below, column-level statistics are stored in Delta Lake checkpoints as a JSON column. In Databricks Runtime 7.3 LTS and above, they are stored as a struct, which makes Delta Lake reads much faster.

There are two table properties that control column-level statistics in checkpoints:

delta.checkpoint.writeStatsAsJson and delta.checkpoint.writeStatsAsStruct. If both table properties are false, no column-level statistics are written to the checkpoint, and readers won't be able to perform data skipping.
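A minimal sketch of setting the two properties together, assuming they are applied with `ALTER TABLE ... SET TBLPROPERTIES`. The helper name and the table name `events` are hypothetical; the code only builds the statement and does not touch a cluster.

```python
def checkpoint_stats_sql(table_name: str, as_json: bool, as_struct: bool) -> str:
    """Build an ALTER TABLE statement for the two checkpoint statistics flags."""
    props = {
        "delta.checkpoint.writeStatsAsJson": as_json,
        "delta.checkpoint.writeStatsAsStruct": as_struct,
    }
    # Render Python booleans as the lowercase strings used in table properties.
    assignments = ", ".join(
        f"'{key}' = '{str(value).lower()}'" for key, value in props.items()
    )
    return f"ALTER TABLE {table_name} SET TBLPROPERTIES ({assignments})"


# Hypothetical table name; keep struct statistics only.
stmt = checkpoint_stats_sql("events", as_json=False, as_struct=True)
print(stmt)

# In a notebook with an active SparkSession named `spark`, you would run:
# spark.sql(stmt)
```

Setting both flags to false trims the cost of writing checkpoints on very large tables, at the price of losing data skipping as described above.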

For more details on tradeoffs with statistics and checkpoints, see here
