When an external location that was previously force-deleted (because it still held a soft-deleted managed table) is re-created, the new external location inherits the reference to the soft-deleted managed table from its predecessor, with no apparent way to purge it. This happens for every future external location created with the same name at the same physical location.
Steps to reproduce (a SQL sketch follows the list):
- create external location
- create schema on external location
- create table on schema
- drop table
- force-delete the external location (forcing is necessary because the reference to the soft-deleted table is retained)
- optional: manually delete the contents (the __unitystorage metadata) of the physical external location (optional because it makes no difference whether this is done or not)
- create a new external location with the same name at the same physical location
- delete the new external location: it fails due to a reference to the dependent soft-deleted table from the previous external location
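For concreteness, this is roughly how I reproduce it in Databricks SQL. All names, the storage credential, and the abfss:// URL are placeholders; substitute your own:

```sql
-- Placeholder names and URL throughout.
CREATE EXTERNAL LOCATION repro_loc
  URL 'abfss://container@account.dfs.core.windows.net/repro'
  WITH (STORAGE CREDENTIAL my_credential);

CREATE SCHEMA main.repro_schema
  MANAGED LOCATION 'abfss://container@account.dfs.core.windows.net/repro/schema1';

CREATE TABLE main.repro_schema.t (id INT);

DROP TABLE main.repro_schema.t;  -- soft delete: Unity Catalog retains a reference

-- A plain DROP fails because of the soft-deleted table, so FORCE is required:
DROP EXTERNAL LOCATION repro_loc FORCE;

-- (Optionally clear the __unitystorage contents in the storage account here;
-- in my tests it made no difference.)

-- Re-create with the same name at the same physical location:
CREATE EXTERNAL LOCATION repro_loc
  URL 'abfss://container@account.dfs.core.windows.net/repro'
  WITH (STORAGE CREDENTIAL my_credential);

-- This now fails, citing the dependent soft-deleted table from the old location:
DROP EXTERNAL LOCATION repro_loc;
```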
I'm aware that the correct way to completely delete a managed table (including references and physical data) involves more steps (see e.g. https://docs.databricks.com/en/delta/vacuum.html#purge-metadata-only-deletes-to-force-data-rewrite), but it is very unintuitive for Unity Catalog to keep references to soft-deleted tables alive past external location deletion, keyed only by external location name + physical location.
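For reference, my understanding of the "complete" deletion path from that page is something like the following (the table name is a placeholder, and I may be missing steps):

```sql
-- Rewrite data files so rows removed by metadata-only deletes are physically
-- dropped, then vacuum the unreferenced files, then drop the table.
REORG TABLE main.repro_schema.t APPLY (PURGE);
VACUUM main.repro_schema.t;  -- subject to the default 7-day retention window
DROP TABLE main.repro_schema.t;
```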
Normally (e.g. for Azure resources), soft-deleted objects remain readable so that one can still act on them (e.g. purge or restore them). How does Databricks handle this? And, equally importantly, where is the metadata for this association to soft-deleted objects stored?
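For comparison, recent Databricks releases do expose soft-deleted managed tables through SQL, though I don't know whether these commands can see (or clear) the dangling reference in this scenario, especially once the external location itself has been force-deleted:

```sql
SHOW TABLES DROPPED IN main.repro_schema;  -- list soft-deleted tables in a schema
UNDROP TABLE main.repro_schema.t;          -- restore, within the retention window
```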
Does anyone know a workaround, or is that external location name burnt forever?