Databricks drop and remove s3 storage files safely

Abela
New Contributor III

After dropping a Delta table with the DROP command in Databricks, is there a way to remove the underlying S3 files without using the rm command? I'm looking for a solution that lets junior developers drop a table safely without touching rm, where the recursive option could cause accidental data loss.

thanks

Alina.

Hubert-Dudek
Databricks MVP

The official way is to run the following before DROP:

  • Run DELETE FROM:

DELETE FROM events

  • Run VACUUM with an interval of zero:

VACUUM events RETAIN 0 HOURS

I agree that something like a DEEP DROP command would be useful 🙂

Alternatively, not in SQL but in Python, you could write a custom class or function for this and preinstall it on your clusters, so people would call something like CleanTable(TableName), which validates the request and then runs delete + vacuum + drop + rm.
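A minimal sketch of what such a helper might look like, in case it is useful. The names clean_table and cleanup_statements and the table-name check are hypothetical, not a Databricks API. It assumes a SparkSession is passed in as spark, and that the Delta retention safety check has to be switched off temporarily for RETAIN 0 HOURS to be accepted. The final rm step is left out on purpose, so the helper itself never deletes paths recursively.

```python
import re

# Hypothetical helper, not part of any Databricks library.
# Accepts plain identifiers only: table, schema.table or catalog.schema.table.
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}$")

# Delta safety check that rejects very short VACUUM retention periods.
_RETENTION_CHECK = "spark.databricks.delta.retentionDurationCheck.enabled"


def cleanup_statements(table_name):
    """Return the SQL statements for delete + vacuum + drop, in order."""
    if not _TABLE_NAME.match(table_name):
        raise ValueError(f"Refusing suspicious table name: {table_name!r}")
    return [
        f"DELETE FROM {table_name}",
        f"VACUUM {table_name} RETAIN 0 HOURS",
        f"DROP TABLE IF EXISTS {table_name}",
    ]


def clean_table(spark, table_name):
    """Run the cleanup statements through spark.sql() and return them."""
    statements = cleanup_statements(table_name)
    # Assumption: the check is on (its default), so it is disabled only
    # for the duration of the cleanup and switched back on afterwards.
    spark.conf.set(_RETENTION_CHECK, "false")
    try:
        for statement in statements:
            spark.sql(statement)
    finally:
        spark.conf.set(_RETENTION_CHECK, "true")
    return statements
```

In a notebook, junior developers would then only call clean_table(spark, "events") and would never need to type a storage path themselves.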


My blog: https://databrickster.medium.com/


jose_gonzalez
Databricks Employee

Hi @Alina Bella​ ,

Like @Hubert Dudek​ mentioned, we have a best practice guide for dropping managed tables. You can find the docs here

Hi @Alina Bella​ ,

If @Hubert Dudek's answer solved the issue, would you be happy to mark their answer as best? That will help others find the solution more easily in the future.