Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

elgeo
by Valued Contributor II
  • 6461 Views
  • 7 replies
  • 8 kudos

Clean up _delta_log files

Hello experts. We are trying to clarify how to clean up the large number of files accumulating in the _delta_log folder (json, crc and checkpoint files). We went through the related posts in the forum and followed the below: SET spark.da...

Latest Reply
michaeljac1986
New Contributor II
  • 8 kudos

What you’re seeing is expected behavior — the _delta_log folder always keeps a history of JSON commit files, checkpoint files, and CRCs. Even if you lower delta.logRetentionDuration and run VACUUM, cleanup won’t happen immediately. A couple of points...

6 More Replies
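
To see what is actually piling up in _delta_log before changing any retention settings, it can help to tally the files by kind. The following is a minimal sketch, not an official tool: it assumes the usual naming convention of a 20-digit zero-padded version followed by `.json`, `.crc`, or a `.checkpoint...parquet` suffix, and the helper names `classify_log_file` and `summarize_log` are hypothetical. Anything that does not match (for example `_last_checkpoint`) is counted as "other".

```python
import re
from collections import Counter

# Assumed naming convention: <20-digit version>.json / .crc / .checkpoint[.parts].parquet
_LOG_FILE = re.compile(
    r"^(?P<version>\d{20})\.(?P<rest>json|crc|checkpoint(?:\.[^.]+)*\.parquet)$"
)


def classify_log_file(name):
    """Return (kind, version) for a _delta_log file name, or ("other", None)."""
    m = _LOG_FILE.match(name)
    if m is None:
        return ("other", None)
    rest = m.group("rest")
    if rest == "json":
        kind = "commit"
    elif rest == "crc":
        kind = "crc"
    else:
        kind = "checkpoint"
    return (kind, int(m.group("version")))


def summarize_log(names):
    """Count files per kind and report the highest checkpoint version seen."""
    counts = Counter()
    latest_checkpoint = None
    for name in names:
        kind, version = classify_log_file(name)
        counts[kind] += 1
        if kind == "checkpoint":
            if latest_checkpoint is None or version > latest_checkpoint:
                latest_checkpoint = version
    return dict(counts), latest_checkpoint
```

The list of names would come from whatever directory listing you have available (for instance a storage listing of the table's _delta_log path). If `latest_checkpoint` comes back as `None`, no checkpoint file was found in the listing, which is worth checking first, since expired log entries are cleaned up when a checkpoint is written rather than by VACUUM.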
vinaykumar
by New Contributor III
  • 8300 Views
  • 6 replies
  • 0 kudos

Log files are not getting deleted automatically after logRetentionDuration interval

Hi team, log files are not getting deleted automatically from the delta log folder after the logRetentionDuration interval, and after analysis, I see checkpoint files are not getting created after 10 commits. Below table properties using spark.sql(    f"""  ...

No checkpoint.parquet
Latest Reply
Anonymous
Not applicable
  • 0 kudos

Hi @vinay kumar, hope all is well! Just wanted to check in: were you able to resolve your issue, and would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

5 More Replies
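
For readers hitting the same symptom, here is a hedged sketch of how the two table properties discussed in this thread could be set together. It is not the original poster's code (their snippet is truncated above): the function name `build_set_properties_sql`, the table name, and the values are illustrative assumptions. `delta.logRetentionDuration` and `delta.checkpointInterval` are the Delta table properties; the defaults shown are the commonly documented ones and may differ on your runtime.

```python
def build_set_properties_sql(
    table,
    log_retention="interval 30 days",  # assumed default; adjust to your needs
    checkpoint_interval=10,            # assumed default; commits between checkpoints
):
    """Build an ALTER TABLE statement that sets log retention and checkpoint interval."""
    props = {
        "delta.logRetentionDuration": log_retention,
        "delta.checkpointInterval": str(checkpoint_interval),
    }
    assignments = ", ".join(f"'{k}' = '{v}'" for k, v in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES ({assignments})"


# Hypothetical usage in a notebook where `spark` is available:
# spark.sql(build_set_properties_sql("my_schema.my_table", "interval 7 days", 10))
```

Setting the properties does not delete anything by itself. Old log entries become eligible only once they are past the retention interval, and they are removed when a later checkpoint is written, so if no checkpoints are being created the log will keep growing.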