
Table Trigger - Too many logfiles

184754
New Contributor II

Hi, 

We have implemented a job that runs on a table-update trigger. The job worked perfectly until the source table accumulated too many log files; now it no longer runs and only shows the error message below:

Storage location /abcd/_delta_log/ contains more than the allowed limit of 9999 objects. Remove objects or choose a different location for your table trigger.

We have set a retention policy (delta.logRetentionDuration=interval 7 days) on the source table, hoping to correct the issue, but it seems to do nothing. When we check, we still see log files that are several months old.
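For reference, here is a minimal sketch of how a property like this is typically set and read back. The table name `my_catalog.my_schema.source_table` and the two helper functions are placeholders made up for illustration; the statements they build are ordinary Delta Lake SQL and would be passed to `spark.sql(...)` in a notebook.

```python
def retention_sql(table_name: str, days: int) -> str:
    """Build the ALTER TABLE statement that sets the Delta log retention."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return (
        f"ALTER TABLE {table_name} "
        f"SET TBLPROPERTIES ('delta.logRetentionDuration' = 'interval {days} days')"
    )


def show_properties_sql(table_name: str) -> str:
    """Build the statement that lists the table properties for verification."""
    return f"SHOW TBLPROPERTIES {table_name}"


# Hypothetical table name, not taken from the original post.
TABLE = "my_catalog.my_schema.source_table"

set_stmt = retention_sql(TABLE, 7)
show_stmt = show_properties_sql(TABLE)
print(set_stmt)
print(show_stmt)

# In a Databricks notebook (where `spark` is predefined) one would then run:
#   spark.sql(set_stmt)
#   spark.sql(show_stmt).show(truncate=False)
```

Reading the properties back only confirms that the setting is stored on the table; it does not show whether any log files have actually been cleaned up.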

Moreover, each update of the source table seems to produce 8 log files, and the table is currently updated each hour. This gives nearly 200 new log files each day, and if we want to update the source table more often we will quickly run into the 9999-file limit.
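The numbers above can be worked through as a rough back-of-the-envelope estimate. The sketch assumes that every update writes exactly 8 objects to `_delta_log` and that, if retention worked, everything older than 7 days would be removed. Both are simplifications, so the results are only indicative.

```python
FILES_PER_UPDATE = 8     # observed in the post
UPDATES_PER_DAY = 24     # one update per hour
OBJECT_LIMIT = 9999      # limit quoted in the error message
RETENTION_DAYS = 7       # delta.logRetentionDuration from the post

# New log objects written per day at the current update rate.
files_per_day = FILES_PER_UPDATE * UPDATES_PER_DAY

# Days until the limit is reached if nothing is ever cleaned up.
days_until_limit = OBJECT_LIMIT / files_per_day

# Approximate object count if only the last 7 days were kept (assumption).
steady_state_files = files_per_day * RETENTION_DAYS

# Highest update rate that would still fit under the limit with 7-day retention.
max_updates_per_day = OBJECT_LIMIT // (FILES_PER_UPDATE * RETENTION_DAYS)

print(f"files per day:             {files_per_day}")
print(f"days until limit:          {days_until_limit:.1f}")
print(f"files with 7-day cleanup:  {steady_state_files}")
print(f"max updates/day (7 days):  {max_updates_per_day}")
```

Under these assumptions the hourly schedule would stay well below the limit if the 7-day retention were actually applied, so the months-old files are the real problem rather than the update frequency.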

Are there any suggestions on how to handle the log file retention? If logRetentionDuration doesn't work, it seems we can't rely on the table trigger feature. Can we manually delete the files, or implement an S3 retention policy of our own, or would this break some logic?

Thanks for any suggestions.

2 REPLIES

radothede
Contributor II

Hi @184754 

Interesting topic. As the docs say:

"Log files are deleted automatically and asynchronously after checkpoint operations and are not governed by VACUUM. While the default retention period of log files is 30 days, running VACUUM on a table removes the data files necessary for time travel."

Anyway, I guess you've already tried running the VACUUM command.
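To see what is actually piling up in `_delta_log`, one option is to group the file names by type. The sketch below is only an illustration: it assumes the usual Delta log naming (20-digit version followed by `.json`, `.crc`, or `.checkpoint...parquet`), and the sample names are made up. In a notebook the names could come from something like `dbutils.fs.ls` on the log directory.

```python
import re
from collections import Counter

# Patterns based on the usual Delta log file naming (assumed here).
PATTERNS = {
    "commit": re.compile(r"^\d{20}\.json$"),
    "checksum": re.compile(r"^\d{20}\.crc$"),
    "checkpoint": re.compile(r"^\d{20}\.checkpoint(\.\d+\.\d+)?\.parquet$"),
}


def classify(name: str) -> str:
    """Return the log file type for a file name, or 'other' if unrecognized."""
    for kind, pattern in PATTERNS.items():
        if pattern.match(name):
            return kind
    return "other"


def summarize(names) -> Counter:
    """Count the file names in a _delta_log listing by type."""
    return Counter(classify(n) for n in names)


# Made-up sample listing; real names would come from the storage location.
sample = [
    f"{10:020d}.json",
    f"{10:020d}.crc",
    f"{10:020d}.checkpoint.parquet",
    f"{11:020d}.json",
    f"{11:020d}.crc",
    "_last_checkpoint",
]

counts = summarize(sample)
for kind, n in sorted(counts.items()):
    print(f"{kind}: {n}")
```

A breakdown like this shows which kinds of files make up the 8 objects per update and whether recent checkpoints exist at all, which matters because the quoted docs tie log cleanup to checkpoint operations.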

184754
New Contributor II

Hi, thanks for the reply. Yeah, the table is vacuumed once per day, but it doesn't seem to remove any of the log files. I'll add the extra information to the post as well.
