Delete Managed Table from S3 Bucket
08-14-2024 06:54 AM
Hello,
I am encountering an issue with our managed tables in Databricks. The tables are stored in an S3 bucket. When I drop a managed table (either through the UI or by running a DROP TABLE statement in a notebook), the table is removed from the catalog in Databricks, but the associated data is not deleted from the S3 bucket, as I would expect for a managed table.
Has anyone else experienced something similar?
Best,
Labels: Delta Lake
08-16-2024 08:49 AM
According to the Databricks documentation, the data in the S3 bucket is deleted within 30 days of the table being dropped.
However, it would be good if there were a way to force this process.
The delay doesn't work for a case I have, where I want to delete a catalog, all its tables, and its linked external location. If I delete the external location first, the managed S3 deletion won't happen and the S3 data will stay there forever.
So now I have failing Terraform pipelines that can't delete this external location (because managed table data is still linked to it), and if I delete it manually with admin rights, the S3 data will remain.
08-16-2024 09:04 AM
@kenkoshaw, thank you for your reply. It is indeed interesting that the data isn't deleted immediately after the table is dropped, and that there's no way to force this process. I suppose I'll have to delete the files from the S3 bucket manually if I want them gone immediately.
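For anyone who ends up doing the manual cleanup, here is a minimal sketch of deleting everything under one S3 prefix with boto3. It is not an official Databricks procedure. The function names, the bucket name, and the prefix are hypothetical placeholders; you would need to look up the real storage path of the dropped table yourself. It defaults to a dry run that only lists the keys, since a wrong prefix would delete unrelated data.

```python
from typing import Any, Iterator, List

# S3 accepts at most 1000 keys per DeleteObjects request.
MAX_KEYS_PER_DELETE = 1000


def iter_keys(s3_client: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every object key under the prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages for an empty prefix carry no "Contents" entry.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def delete_prefix(
    s3_client: Any, bucket: str, prefix: str, dry_run: bool = True
) -> List[str]:
    """Delete all objects under the prefix and return the affected keys.

    With dry_run=True (the default) nothing is deleted; the keys that
    would be removed are only listed and returned.
    """
    # Refuse an empty or open-ended prefix so a typo cannot wipe the bucket
    # or match sibling folders that merely share a name prefix.
    if not prefix or not prefix.endswith("/"):
        raise ValueError("prefix must be non-empty and end with '/'")

    keys = list(iter_keys(s3_client, bucket, prefix))
    if dry_run:
        return keys

    for start in range(0, len(keys), MAX_KEYS_PER_DELETE):
        batch = keys[start:start + MAX_KEYS_PER_DELETE]
        response = s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch], "Quiet": True},
        )
        errors = response.get("Errors", [])
        if errors:
            raise RuntimeError(f"failed to delete {len(errors)} object(s): {errors[:3]}")
    return keys


# Example usage (placeholder names, requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   print(delete_prefix(s3, "my-bucket", "path/to/dropped/table/"))       # dry run
#   delete_prefix(s3, "my-bucket", "path/to/dropped/table/", dry_run=False)
```

Note that on a bucket with versioning enabled, a delete like this only adds delete markers; the older object versions stay in the bucket until they are removed separately or expired by a lifecycle rule.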

