Data Governance
Join discussions on data governance practices, compliance, and security within the Databricks Community. Exchange strategies and insights to ensure data integrity and regulatory compliance.

Can we assume the path to the managed tables in the hive_metastore is reliable?

giohappy
New Contributor III

Managed tables are stored under /user/hive/warehouse, as also mentioned in the documentation.

In our workflow, we use that path to read the parquet files from outside (through the Databricks connector). Can we assume this path is reliable, or is it an "implementation detail" that might change at any time?
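For illustration, the default Hive warehouse layout being relied on here can be sketched as a small helper. This is a hypothetical function, not an official API; it just encodes the conventional layout in which non-default databases get a `<database>.db` subdirectory under the warehouse root:

```python
def hive_warehouse_path(database: str, table: str,
                        warehouse_root: str = "/user/hive/warehouse") -> str:
    """Sketch of the conventional Hive metastore layout (illustrative only):
    tables in the default database sit directly under the warehouse root;
    tables in any other database go under a '<database>.db' subdirectory."""
    if database == "default":
        return f"{warehouse_root}/{table}"
    return f"{warehouse_root}/{database}.db/{table}"

# e.g. a managed table "orders" in database "sales" would conventionally be at:
# hive_warehouse_path("sales", "orders")
#   -> "/user/hive/warehouse/sales.db/orders"
```

Whether an external pipeline can depend on this layout is exactly the question being asked.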

3 REPLIES

giohappy
New Contributor III

Yes, that link was also mentioned in my question. The point is whether our pipeline can always assume that the path is where the parquet files for the managed tables will be, or whether it's just an internal detail that could change at any time.

giohappy
New Contributor III

In our case we haven't configured or created the metastore directly. We're relying on the default metastore, which is where the tables are written when we do:

df.write.format("delta").mode("overwrite").saveAsTable(output_table_name)

I haven't found anything saying that the path of the default metastore might change unexpectedly. Then again, I haven't found anything stating the opposite either 🙂

MoJaMa
Databricks Employee

That path is reliable, but we would recommend not using it in general.
That's your workspace root storage.

Your data should be in a cloud path of your choosing (S3/ADLS/GCS) so that you can separate your data by BU/project/team, etc., based on which buckets each one owns.

When you create a schema in the Hive metastore (HMS), you can run:

CREATE SCHEMA A LOCATION 's3 path';

Then, when you create a table in that schema, it will be a managed table in a sub-path of that location.
Now it's not tied to workspace root storage.
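The resulting layout can be sketched the same way (hypothetical helper and bucket name, for illustration only; the convention is that a managed table lands in a subdirectory of the schema's LOCATION named after the table):

```python
def managed_table_path(schema_location: str, table: str) -> str:
    """Sketch only: a managed table in a schema created with an explicit
    LOCATION is written to a subdirectory named after the table."""
    return schema_location.rstrip("/") + "/" + table

# e.g. after CREATE SCHEMA a LOCATION 's3://team-bucket/a' (hypothetical bucket),
# a managed table "orders" would land under:
# managed_table_path("s3://team-bucket/a", "orders")
#   -> "s3://team-bucket/a/orders"
```

This way each team's data stays in a bucket that team owns, rather than in the workspace root storage.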
