<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can we assume the path to the managed tables in the hive_metastore is reliable? in Data Governance</title>
    <link>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37553#M1090</link>
    <description>&lt;P&gt;Yes, that link was also mentioned in my question. The question is whether our pipeline can always assume that this path is where the Parquet files for the managed tables live, or whether it's just an internal detail that could change at any time.&lt;/P&gt;</description>
    <pubDate>Thu, 13 Jul 2023 09:55:19 GMT</pubDate>
    <dc:creator>giohappy</dc:creator>
    <dc:date>2023-07-13T09:55:19Z</dc:date>
    <item>
      <title>Can we assume the path to the managed tables in the hive_metastore is reliable?</title>
      <link>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37547#M1088</link>
      <description>&lt;P&gt;Managed tables are stored under&amp;nbsp;&lt;STRONG&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;/user/hive/warehouse&lt;/FONT&gt;&lt;/STRONG&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;, as &lt;A href="https://docs.databricks.com/dbfs/root-locations.html#what-is-stored-in-the-userhivewarehouse-directory" target="_blank" rel="noopener"&gt;mentioned in the documentation&lt;/A&gt;.&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="arial,helvetica,sans-serif"&gt;In our workflow, we use that path to read the Parquet files from outside (through the Databricks connector). Can we assume this path is reliable, or is it an "implementation detail" that might change at any time?&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2023 08:27:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37547#M1088</guid>
      <dc:creator>giohappy</dc:creator>
      <dc:date>2023-07-13T08:27:26Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assume the path to the managed tables in the hive_metastore is reliable?</title>
      <link>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37553#M1090</link>
      <description>&lt;P&gt;Yes, that link was also mentioned in my question. The question is whether our pipeline can always assume that this path is where the Parquet files for the managed tables live, or whether it's just an internal detail that could change at any time.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jul 2023 09:55:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37553#M1090</guid>
      <dc:creator>giohappy</dc:creator>
      <dc:date>2023-07-13T09:55:19Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assume the path to the managed tables in the hive_metastore is reliable?</title>
      <link>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37614#M1095</link>
      <description>&lt;P&gt;In our case, we haven't configured or created the metastore ourselves. We're relying on the default metastore, which is where the tables are written when we run:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;df.write.format("delta").mode("overwrite").saveAsTable(output_table_name)&lt;/LI-CODE&gt;&lt;P&gt;I haven't found anything saying that the path of the default metastore might change unexpectedly. That said, I haven't found anything stating the opposite either &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jul 2023 08:01:20 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/37614#M1095</guid>
      <dc:creator>giohappy</dc:creator>
      <dc:date>2023-07-14T08:01:20Z</dc:date>
    </item>
    <item>
      <title>Re: Can we assume the path to the managed tables in the hive_metastore is reliable?</title>
      <link>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/102675#M2325</link>
      <description>&lt;P&gt;That path is reliable, but in general we recommend not using it.&amp;nbsp;&lt;BR /&gt;It is your workspace root storage.&lt;/P&gt;
&lt;P&gt;Your data should live in a cloud storage path of your choosing (S3/ADLS/GCS) so that you can separate it by BU, project, team, etc., based on which buckets each one owns.&lt;/P&gt;
&lt;P&gt;When you create a schema in the Hive metastore (HMS), you can run:&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;CREATE SCHEMA A LOCATION 's3 path';&lt;/STRONG&gt;&lt;/EM&gt;&lt;/P&gt;
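&lt;P&gt;As a minimal sketch of this pattern (the schema name, table name, and bucket path are hypothetical placeholders, not values from this thread, and Delta is assumed as the table format), a pipeline could create the schema at an explicit location and then look up a table's storage path from the metastore instead of hardcoding it:&lt;/P&gt;
&lt;LI-CODE lang="sql"&gt;-- Hypothetical names: replace the schema, table, and bucket with your own.
CREATE SCHEMA IF NOT EXISTS sales LOCATION 's3://my-bucket/sales';

-- A table created without its own LOCATION clause is a managed table
-- stored under the schema's location.
CREATE TABLE IF NOT EXISTS sales.orders (id INT, amount DOUBLE);

-- Ask the metastore where the table lives rather than assuming a path.
-- For Delta tables, the result includes a "location" column.
DESCRIBE DETAIL sales.orders;&lt;/LI-CODE&gt;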
&lt;P&gt;Then, when you create a table in that schema, it will be a managed table stored in a sub-path of that location.&lt;BR /&gt;It is no longer tied to workspace root storage.&lt;/P&gt;</description>
      <pubDate>Thu, 19 Dec 2024 15:35:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-governance/can-we-assume-the-path-to-the-managed-tables-in-the-hive/m-p/102675#M2325</guid>
      <dc:creator>MoJaMa</dc:creator>
      <dc:date>2024-12-19T15:35:08Z</dc:date>
    </item>
  </channel>
</rss>

