Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

dropping a managed table does not remove the underlying files

my_community2
New Contributor III

The documentation states that "drop table":

Deletes the table and removes the directory associated with the table from the file system if the table is not EXTERNAL table. An exception is thrown if the table does not exist.

In case of an external table, only the associated metadata information is removed from the metastore schema.

This does not work!!

I have a managed table, see below, managed and stored on a mounted Azure storage account:

[screenshot: image.png]

Then I execute spark.sql("drop table silver.company")

the result:

  • metadata is deleted from the catalog
  • folder is NOT deleted from disk (it should have been deleted as per docs)
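One way to confirm what the metastore actually records for the table before dropping it is to read the "Type" and "Location" rows from DESCRIBE TABLE EXTENDED. The sketch below is only an illustration: the helper name describe_table_storage is hypothetical, and it assumes a SparkSession is passed in as spark, as in a Databricks notebook.

```python
def describe_table_storage(spark, table_name):
    """Return (table_type, location) as reported by DESCRIBE TABLE EXTENDED.

    `spark` is assumed to be an active SparkSession; `table_name` is a
    qualified name such as "silver.company".
    """
    rows = spark.sql(f"DESCRIBE TABLE EXTENDED {table_name}").collect()
    # Each row has col_name / data_type / comment. The detailed section at the
    # end of the output carries key/value pairs such as Type and Location.
    info = {row["col_name"]: row["data_type"] for row in rows}
    return info.get("Type"), info.get("Location")


# Example call in a notebook (not executed here):
# table_type, location = describe_table_storage(spark, "silver.company")
```

If the reported type is MANAGED and the location points at the mounted storage, that matches the situation described above.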

1 ACCEPTED SOLUTION

Accepted Solutions

Lakshay
Esteemed Contributor

Are you using Unity Catalog? If so, the underlying data will be deleted from your cloud tenant within 30 days.

https://docs.databricks.com/data-governance/unity-catalog/create-tables.html#managed-tables


9 REPLIES

Debayan
Esteemed Contributor III

Hi, could you please point to the documentation you followed, including the one used to configure the external metadata? Also, could you please list the steps you took?

my_community2
New Contributor III

docs at https://docs.microsoft.com/en-us/azure/databricks/spark/latest/spark-sql/language-manual/sql-ref-syn...

To mount a drive in Databricks, I used the following Python code:

mountPoint = "/mnt/bronze"
accountName = "<provide account name>"
containerName = "bronze"
account_key = "<provide account key>"

# Mount the blob container at the mount point, authenticating with the account key
dbutils.fs.mount(
    source=f"wasbs://{containerName}@{accountName}.blob.core.windows.net",
    mount_point=mountPoint,
    extra_configs={f"fs.azure.account.key.{accountName}.blob.core.windows.net": account_key},
)

To create the table, I just ran CREATE ...

Then I dropped it. The directory, with all its files, remains after the table is dropped.
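A minimal sketch of how one might check whether the directory is still there after the drop, and remove it by hand if so. The helper names are hypothetical, and the table directory path is not given in this thread, so it is left as a parameter. It assumes the notebook's dbutils object is passed in, and that a missing path surfaces as an error whose message mentions FileNotFoundException (an assumption about how dbutils reports it). Removing files by hand like this is only a workaround for a mounted, non-Unity-Catalog location.

```python
def directory_remains(dbutils, path):
    """Return True if `path` can still be listed, False if it no longer exists."""
    try:
        dbutils.fs.ls(path)
        return True
    except Exception as exc:
        # Assumption: a missing path is reported as a Java FileNotFoundException.
        if "FileNotFoundException" in str(exc):
            return False
        raise  # any other failure (permissions, bad mount) should stay visible


def remove_leftover_directory(dbutils, path):
    """Recursively delete `path` if it still exists; return True if removed."""
    if directory_remains(dbutils, path):
        # The second argument is the recurse flag of dbutils.fs.rm.
        return bool(dbutils.fs.rm(path, True))
    return False


# Example call in a notebook, with a hypothetical path (not executed here):
# remove_leftover_directory(dbutils, "/mnt/<mount>/<table directory>")
```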

Anonymous
Not applicable

Hi @Maciej G

Hope all is well! Just checking in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

alesventus
Contributor

Hi, did you solve this issue? I have the very same issue with managed tables now.

Lakshay
Esteemed Contributor

Are you using Unity Catalog? If so, the underlying data will be deleted from your cloud tenant within 30 days.

https://docs.databricks.com/data-governance/unity-catalog/create-tables.html#managed-tables

karthik_p
Esteemed Contributor

@alesventus, how are you creating your table? Is it in Unity Catalog, and are you using a managed table with an external location? Usually you should not specify a location for a managed table. If you are not using Unity Catalog, the table is created in the Hive metastore by default. Please share the query you used to create the managed table.

If no location was specified when the table was created, then the drop should work.
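To see how the metastore classified each table in a schema, something like the following sketch could help. The helper name table_types is hypothetical, and it assumes a SparkSession is passed in as spark; spark.catalog.listTables reports a tableType such as MANAGED or EXTERNAL for each entry.

```python
def table_types(spark, database):
    """Map each table name in `database` to the type recorded in the catalog."""
    # listTables returns entries with .name and .tableType attributes.
    return {t.name: t.tableType for t in spark.catalog.listTables(database)}


# Example call in a notebook (not executed here):
# table_types(spark, "silver")
```

A table that shows up as EXTERNAL keeps its files when dropped, which is the behaviour the documentation describes.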

Hi @karthik_p, the correct answer is the one given by @Lakshay. There is a 30-day retention period after a managed table is dropped.

MajdSAAD_7953
New Contributor II

Hi,

Is there a way to force-delete the files after dropping a table, rather than waiting 30 days to see the size in S3 decrease?

The tables I dropped belong to dev and staging, and I don't want to keep their files for 30 days.

 

n_joy
New Contributor III
