
Cluster logs folder

fridthoy
New Contributor II

Hi, 

 

I can't seem to find the cluster_logs folder. Can anyone help me find where the cluster logs are stored?

[screenshot: fridthoy_0-1729764812475.png]

Best regards

7 REPLIES

gchandra
Databricks Employee

[screenshot: gchandra_0-1729767971646.png]

Check your cluster configuration to see where the log destination is pointing.




fridthoy
New Contributor II

[screenshot: fridthoy_0-1729769602276.png]

It is pointing to no destination, but I can still see logs when I open my cluster.

 

gchandra
Databricks Employee

If you want to save them as files, please choose the default dbfs:/cluster-logs destination here.
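If you prefer to set this programmatically, here is a minimal sketch using the Clusters API; cluster_log_conf is the documented field for log delivery, while the workspace URL, token, and cluster details below are placeholders (clusters/edit expects the full cluster spec):

```python
# Sketch only: point an existing cluster's log delivery at DBFS via the
# Databricks REST API. Logs are then delivered periodically under
# <destination>/<cluster-id>/driver and <destination>/<cluster-id>/executor.
import requests

WORKSPACE_URL = "https://<your-workspace>.cloud.databricks.com"  # placeholder
TOKEN = "<personal-access-token>"                                # placeholder

cluster_spec = {
    "cluster_id": "<cluster-id>",         # placeholder
    "spark_version": "<spark-version>",   # edit requires the full desired spec
    "node_type_id": "<node-type>",
    "num_workers": 2,
    "cluster_log_conf": {
        "dbfs": {"destination": "dbfs:/cluster-logs"}
    },
}

resp = requests.post(
    f"{WORKSPACE_URL}/api/2.0/clusters/edit",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
```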




fridthoy
New Contributor II

Will cluster logs still be saved even if I have not specified a location? My storage cost has increased a lot without extra data in storage, and I think it is because of the logs.

filipniziol
Contributor III

Hi @fridthoy ,

If you experience high storage costs, it may be worth checking whether you regularly run the VACUUM command on your existing Delta tables.

How Does Not Performing VACUUM Affect Storage?

Delta Lake Architecture: Delta Lake maintains a transaction log (_delta_log) that tracks all changes to the tables. When you perform operations like UPDATE, DELETE, or MERGE, Delta Lake retains previous versions of data files to support features like time travel and versioning.

Accumulation of Old Files: Without regular maintenance, these old data files can accumulate, leading to increased storage usage. This is especially true for tables with frequent write operations.
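As a quick illustration, here is a minimal sketch of running VACUUM from a Databricks notebook; the table name is a hypothetical placeholder, and `spark` is assumed to be the notebook's session:

```python
# Dry run first: lists the files that would be deleted, removes nothing.
spark.sql("VACUUM main.sales.orders RETAIN 168 HOURS DRY RUN").show(truncate=False)

# Remove data files no longer referenced and older than the retention window
# (168 hours = 7 days, the default; shorter windows limit time travel).
spark.sql("VACUUM main.sales.orders RETAIN 168 HOURS")

# Confirm the operation in the table history.
spark.sql("DESCRIBE HISTORY main.sales.orders") \
    .select("timestamp", "operation", "operationParameters") \
    .show(truncate=False)
```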

In summary, I do not think cluster logs are the cause of the increased storage costs, as normally they are just a small fraction of your overall storage. I would start by investigating the storage used by tables that undergo regular refreshes.

fiff
New Contributor II

Thank you for the help! 

I have enabled predictive optimization for Unity Catalog, thinking it would automatically perform VACUUM on the tables I have in my Delta lake. With that in mind, I assumed VACUUM wouldn't require further attention.

Would it be better to manually run VACUUM on my tables, or is predictive optimization sufficient?

Best regards 

filipniziol
Contributor III

Hi @fiff ,

According to the documentation, predictive optimization runs VACUUM automatically.
However, there are a couple of exceptions:

[screenshot: filipniziol_0-1729841523579.png — documentation excerpt listing the exceptions]
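As a hedged sketch of how you might verify this yourself, the system table below is the documented operations history for predictive optimization (it requires Unity Catalog and access to system tables); the catalog name is a hypothetical placeholder:

```python
# List the most recent maintenance operations predictive optimization has run.
history = spark.sql("""
    SELECT schema_name, table_name, operation_type, start_time, end_time
    FROM system.storage.predictive_optimization_operations_history
    WHERE catalog_name = 'main'   -- hypothetical catalog
    ORDER BY start_time DESC
    LIMIT 20
""")
history.show(truncate=False)
```

If your tables never show up here, that would suggest they fall under one of the documented exceptions and still need a manually scheduled VACUUM.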

 


