<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Accessing data from a legacy hive metastore workspace on a new Unity Catalog workspace in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/65120#M32743</link>
    <description>&lt;P&gt;Your aim is to access external S3 tables from a Unity Catalog workspace without data duplication, while keeping data updates synchronized. First, configure external location permissions: ensure that both your Unity Catalog and Hive metastore workspaces have read permissions on the S3 location containing your tables. This lets both workspaces access the same underlying data without duplication. Then create external tables in Unity Catalog with the 'CREATE EXTERNAL TABLE ...' syntax, specifying the S3 location and schema of the existing table. This creates pointers to the existing data in S3 without copying it. Remember that, in Hive or otherwise, external tables are just pointers, often used for ETL by overwriting the existing data (in your case on S3). Both Hive and Unity Catalog control the schema and point to the data location, but neither controls the data itself. You can then access the data from both Hive and Unity Catalog.&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;</description>
    <pubDate>Sun, 31 Mar 2024 08:52:06 GMT</pubDate>
    <dc:creator>MichTalebzadeh</dc:creator>
    <dc:date>2024-03-31T08:52:06Z</dc:date>
    <item>
      <title>Accessing data from a legacy hive metastore workspace on a new Unity Catalog workspace</title>
      <link>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/64932#M32706</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;For the purposes of testing I'm interested in creating a &lt;STRONG&gt;new workspace&lt;/STRONG&gt; &lt;STRONG&gt;with Unity Catalog enabled&lt;/STRONG&gt;, and from there I'd like to access (&lt;STRONG&gt;external - S3&lt;/STRONG&gt;) tables on an &lt;STRONG&gt;existing legacy hive metastore workspace&lt;/STRONG&gt; (not UC enabled). The goal is for both workspaces to point to the same underlying S3 external location.&lt;/P&gt;&lt;P&gt;As a requirement I do not want to duplicate data &amp;amp; ideally updates to data on the legacy workspace would be reflected in tables surfaced through UC.&lt;/P&gt;&lt;P&gt;I was considering the possibility of shallow cloning; however, from my understanding that is not possible across UC &amp;amp; the hive metastore.&lt;/P&gt;&lt;P&gt;Does anybody have experience/recommendations on doing this? Looking through the Databricks documentation I'm mostly finding information on upgrading a legacy workspace only.&lt;/P&gt;&lt;P&gt;#unitycatalog #hivemetastore&lt;/P&gt;</description>
      <pubDate>Thu, 28 Mar 2024 16:01:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/64932#M32706</guid>
      <dc:creator>hossein_kolahdo</dc:creator>
      <dc:date>2024-03-28T16:01:02Z</dc:date>
    </item>
    <item>
      <title>Re: Accessing data from a legacy hive metastore workspace on a new Unity Catalog workspace</title>
      <link>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/64970#M32716</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/9"&gt;@Retired_mod&lt;/a&gt;&amp;nbsp;From looking at the documentation, none of it addresses my particular use case, which I illustrated (two workspaces on one account, one with UC and the other without). Was there a particular part of any of the docs you're suggesting could help here?&lt;/P&gt;</description>
      <pubDate>Thu, 28 Mar 2024 20:54:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/64970#M32716</guid>
      <dc:creator>hossein_kolahdo</dc:creator>
      <dc:date>2024-03-28T20:54:08Z</dc:date>
    </item>
    <item>
      <title>Re: Accessing data from a legacy hive metastore workspace on a new Unity Catalog workspace</title>
      <link>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/65120#M32743</link>
      <description>&lt;P&gt;Your aim is to access external S3 tables from a Unity Catalog workspace without data duplication, while keeping data updates synchronized. First, configure external location permissions: ensure that both your Unity Catalog and Hive metastore workspaces have read permissions on the S3 location containing your tables. This lets both workspaces access the same underlying data without duplication. Then create external tables in Unity Catalog with the 'CREATE EXTERNAL TABLE ...' syntax, specifying the S3 location and schema of the existing table. This creates pointers to the existing data in S3 without copying it. Remember that, in Hive or otherwise, external tables are just pointers, often used for ETL by overwriting the existing data (in your case on S3). Both Hive and Unity Catalog control the schema and point to the data location, but neither controls the data itself. You can then access the data from both Hive and Unity Catalog.&lt;/P&gt;&lt;P&gt;HTH&lt;/P&gt;</description>
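      The two steps in the answer above can be sketched in SQL. This is a minimal illustration only: the external location name, principal, catalog/schema/table names, and S3 path are all placeholders, and it assumes a Unity Catalog external location (with its storage credential) has already been registered for the bucket. Note that in Unity Catalog the external-table form is `CREATE TABLE ... LOCATION`; for an existing Delta table the schema is read from the Delta log, so no column list is needed.

      ```sql
      -- Step 1: let Unity Catalog principals read the S3 location that the
      -- legacy Hive metastore tables already use (names are hypothetical).
      GRANT READ FILES ON EXTERNAL LOCATION legacy_s3_location TO `data_engineers`;

      -- Step 2: register an external table in Unity Catalog pointing at the
      -- same S3 path as the legacy table. No data is copied; both workspaces
      -- see the same files, so writes from either side are visible to both.
      CREATE TABLE main.legacy_bridge.events
      USING DELTA
      LOCATION 's3://my-bucket/warehouse/events/';
      ```

      For non-Delta formats (e.g. Parquet) the column list and `USING PARQUET` would be spelled out explicitly, since there is no transaction log to infer the schema from.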
      <pubDate>Sun, 31 Mar 2024 08:52:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/accessing-data-from-a-legacy-hive-metastore-workspace-on-a-new/m-p/65120#M32743</guid>
      <dc:creator>MichTalebzadeh</dc:creator>
      <dc:date>2024-03-31T08:52:06Z</dc:date>
    </item>
  </channel>
</rss>

