07-13-2023 01:25 AM - edited 07-13-2023 01:27 AM
Managed tables are stored under the /user/hive/warehouse path, as mentioned in the documentation.
In our workflow, we use that path to read the Parquet files from outside Databricks (through the Databricks connector). Can we rely on this path, or is it an "implementation detail" that might change at any time?
07-13-2023 02:03 AM - edited 07-13-2023 02:04 AM
Hi @giohappy, the path /user/hive/warehouse is commonly used as the default location for managed tables, according to the documentation.
07-13-2023 02:55 AM
Yes, that link was also mentioned in my question. The point is whether our pipeline can always assume that this path is where the Parquet files for managed tables live, or whether it is just an internal detail that could change at any time.
07-13-2023 04:18 AM
Hi @giohappy, by default, managed tables are stored in the root storage location you configure when creating a metastore. You can optionally specify managed table storage locations at the catalog or schema level, overriding the root storage location. Managed tables always use the Delta table format.
When a managed table is dropped, its underlying data is deleted from your cloud tenant within 30 days.
See Managed tables.
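One way to sidestep the question entirely is to not hard-code the path: since managed tables here are Delta tables, you can ask the metastore for a table's actual storage location at runtime with `DESCRIBE DETAIL`. A minimal sketch, assuming a live `SparkSession` named `spark`; the helper name `table_location` is mine, not part of any API:

```python
def table_location(spark, table_name: str) -> str:
    """Return the storage location of a Delta table.

    DESCRIBE DETAIL returns a single row for a Delta table; its
    `location` column holds the table's storage path. Resolving the
    path this way keeps the pipeline correct even if the default
    warehouse location ever changes.
    """
    return spark.sql(f"DESCRIBE DETAIL {table_name}").collect()[0]["location"]
```

The returned location can then be fed to whatever reads the Parquet/Delta files externally, instead of assuming /user/hive/warehouse.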
07-14-2023 01:01 AM
In our case we haven't configured or created the metastore directly. We're relying on the default metastore, which is where the tables are written when we do:
df.write.format("delta").mode("overwrite").saveAsTable(output_table_name)
I haven't found anything saying that the path of the default metastore might change unexpectedly. Then again, I haven't found anything stating the opposite either :)
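For reference, the path layout our pipeline assumes is the conventional Hive one: tables in the `default` schema sit directly under the warehouse root, while other schemas get a `<schema>.db` subdirectory. This is an assumption about a convention, not a documented Databricks guarantee, so a small helper keeps it in one place if it ever changes (the function name and defaults are mine):

```python
def default_managed_table_path(table: str, schema: str = "default",
                               warehouse_root: str = "/user/hive/warehouse") -> str:
    """Build the conventional Hive warehouse path for a managed table.

    Assumption (Hive convention, not a contract): the `default` schema
    stores tables directly under the warehouse root; any other schema
    stores them under a `<schema>.db` subdirectory.
    """
    if schema == "default":
        return f"{warehouse_root}/{table}"
    return f"{warehouse_root}/{schema}.db/{table}"
```

If the warehouse location ever moves, only `warehouse_root` needs updating.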
07-14-2023 01:09 AM
While the current documentation states that managed tables are stored under the /user/hive/warehouse path, you can rely on that until Databricks officially announces an update or change to this implementation detail.