Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices. Connect with administrators and architects to optimize your Databricks environment for performance, scalability, and security.

Bug when re-creating force deleted external location

camilo_s
Contributor

When re-creating an external location that was previously force-deleted (because it had a soft-deleted managed table), the newly re-created external location preserves the reference to the soft-deleted managed table from the previous external location, with no possibility of purging. This will happen for all future external locations that are created with the same name and at the same physical location.

Steps to reproduce:

  1. create external location
  2. create schema on external location
  3. create table on schema
  4. drop table
  5. force-delete external location (forcing is necessary because the reference to the soft-deleted table is retained)
  6. optional: manually delete the content (the __unitystorage metadata) of the physical external location (optional because it makes no difference whether this is done or not)
  7. create new external location with the same name at the same physical location
  8. delete the new external location: it will fail due to a reference to the soft-deleted table left over from the previous external location
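The steps above can be sketched in SQL. All names, the ABFSS path, and the storage credential are placeholders; the credential is assumed to already exist:

```sql
-- 1. Create the external location (credential and URL are placeholders)
CREATE EXTERNAL LOCATION my_ext_loc
URL 'abfss://container@account.dfs.core.windows.net/path'
WITH (STORAGE CREDENTIAL my_credential);

-- 2. Create a schema whose managed storage lives on that location
CREATE SCHEMA my_catalog.my_schema
MANAGED LOCATION 'abfss://container@account.dfs.core.windows.net/path';

-- 3. Create a managed table in the schema
CREATE TABLE my_catalog.my_schema.my_table (id INT);

-- 4. Drop the table (it becomes soft-deleted, not purged)
DROP TABLE my_catalog.my_schema.my_table;

-- 5. Force-delete the external location
DROP SCHEMA my_catalog.my_schema;
DROP EXTERNAL LOCATION my_ext_loc FORCE;

-- 7. Re-create the location with the same name and URL
CREATE EXTERNAL LOCATION my_ext_loc
URL 'abfss://container@account.dfs.core.windows.net/path'
WITH (STORAGE CREDENTIAL my_credential);

-- 8. Deleting it again fails, citing the soft-deleted table
DROP EXTERNAL LOCATION my_ext_loc;
```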

I'm aware that completely deleting a managed table (including references and physical data) takes more steps (see e.g. https://docs.databricks.com/en/delta/vacuum.html#purge-metadata-only-deletes-to-force-data-rewrite), but it's very unintuitive for Unity Catalog to keep references to soft-deleted tables past external location deletion, keyed on the external location name + physical location.
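For reference, the full purge path the linked vacuum docs describe looks roughly like the following sketch (table name is a placeholder; this must run before the table is dropped):

```sql
-- Rewrite data files so metadata-only deletes are physically applied
REORG TABLE my_catalog.my_schema.my_table APPLY (PURGE);

-- Then vacuum to remove files no longer referenced by the table
VACUUM my_catalog.my_schema.my_table;
```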

Normally (e.g. for Azure resources), soft-deleted objects remain readable so that one can still act on them (e.g. purge or restore them). How does Databricks handle this? And, equally importantly, where is the metadata for this association to soft-deleted objects stored?

Does anyone know a workaround, or is that external location name burnt forever?

1 REPLY

mark_ott
Databricks Employee

Databricks Unity Catalog currently maintains references to soft-deleted managed tables even after the associated external location is force-deleted and re-created with the same name and physical location, causing persistent deletion failures due to lingering dependencies. This behavior differs from typical cloud resource handling (such as Azure), where soft-deleted objects are explicitly recoverable, restorable, or purgeable by direct user action.

How Databricks Handles Soft-Deleted References

  • When you force-delete an external location that has soft-deleted managed tables, Unity Catalog still retains metadata tying the external location name and physical path to those tables.

  • Re-creating an external location with the same name and path leads Unity Catalog to "reattach" the metadata of previously soft-deleted tables, making it impossible to purge or remove them via normal UI/commands.

  • Attempting to delete the newly created external location fails, citing dependent soft-deleted tables, even if all physical data and metadata content at the physical location has been manually cleared.

  • The association metadata is managed at the Unity Catalog service layer, not in the underlying file store, meaning deleting content at the storage does not affect these references.
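That said, Unity Catalog does expose a limited window onto soft-deleted managed tables while their parent schema still exists. A sketch (schema and table names are placeholders, and this assumes the schema has not yet been dropped):

```sql
-- List managed tables that were dropped but not yet permanently purged
SHOW TABLES DROPPED IN my_catalog.my_schema;

-- Restore one of them while it is still within the retention window
UNDROP TABLE my_catalog.my_schema.my_table;
```

Once the schema and external location themselves are force-deleted, these commands no longer have a target, which is what makes the scenario in the original question so awkward.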

Location of Metadata for Soft-Deleted Associations

  • The metadata associating external locations and soft-deleted managed tables is stored internally in Unity Catalog's backend, which uses its own persistent metadata storage beyond the control of standard file system operations or Delta Lake commands.

  • This metadata is not directly visible or editable to the user and cannot be purged by clearing the physical external location (e.g., deleting the __unitystorage folder).

  • Only certain Unity Catalog administrative APIs or support tools could theoretically access or clean up these references, but such functionality is not exposed to end users for safety and consistency reasons.

Workarounds and Permanent Consequences

  • There is no documented workaround to purge these lingering soft-deleted references when re-using the same external location name and physical location, which leaves that name/path combination effectively "burnt" unless Databricks changes this behavior.

  • The only feasible current solution is to always use a new external location name or a different underlying path when re-creating external locations, thereby avoiding Unity Catalog's residual associations.

  • Databricks support or engineering can sometimes manually clean up such orphaned metadata, but official channels and change requests are required.
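The recommended workaround can be sketched as follows (new name and path are placeholders; the storage credential is assumed to exist):

```sql
-- Re-create under a different name and, ideally, a different path,
-- so Unity Catalog cannot reattach the orphaned soft-delete metadata
CREATE EXTERNAL LOCATION my_ext_loc_v2
URL 'abfss://container@account.dfs.core.windows.net/path_v2'
WITH (STORAGE CREDENTIAL my_credential);
```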

Key Points Table

| Action | Result/Behavior | Can Be Purged Manually? | Recommended Action |
| --- | --- | --- | --- |
| Force-delete external location | Soft-deleted table references persist in Unity Catalog metadata | No | Use a new name/path for the new external location |
| Manually delete physical content | No effect: metadata is stored at the Unity Catalog layer | No | Contact Databricks support for metadata cleanup |
| Re-use name & path for location | Triggers reference to lingering soft-deleted objects | No | Avoid reusing the name/path; pick new ones for future use |

Additional Notes

  • Unity Catalog's retention of soft-deleted table references is meant to maintain strict data lineage and recoverability but introduces unintuitive, persistent orphaned metadata in this scenario.

  • For true deletion, you must follow full purge steps at the managed table level rather than relying on force-deletion of the external location.

  • If stuck, reach out to Databricks support for possible manual intervention during migration or cleanup.

For more details and purge instructions, Databricks publishes specific guidance in their Delta Lake vacuum and deletion documentation.