<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Can external tables be created backed by current cloud files without ingesting files in Databric in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113151#M44437</link>
    <description>&lt;P&gt;I have read this doc, but I had no luck; it did not work for me.&lt;/P&gt;</description>
    <pubDate>Thu, 20 Mar 2025 13:58:54 GMT</pubDate>
    <dc:creator>Jennifer</dc:creator>
    <dc:date>2025-03-20T13:58:54Z</dc:date>
    <item>
      <title>Can external tables be created backed by current cloud files without ingesting files in Databricks?</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113018#M44391</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We have a huge number of Parquet files in S3 with the path pattern &amp;lt;bucket&amp;gt;/&amp;lt;customer&amp;gt;/yyyy/mm/dd/hh/.*.parquet.&lt;/P&gt;&lt;P&gt;The question is: can I create an external table in Unity Catalog from this external location without actually ingesting the files? This is similar to what can be done in AWS Athena:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;CREATE EXTERNAL TABLE tbl(
&amp;lt;schema&amp;gt;
)
PARTITIONED BY (...)
LOCATION
's3://&amp;lt;bucket&amp;gt;/'&lt;/LI-CODE&gt;&lt;P&gt;I have tried a couple of ways to create such a table, but they resulted in errors such as "there is not delta commit logs" or "DELTA_CREATE_TABLE_WITH_NON_EMPTY_LOCATION".&lt;/P&gt;&lt;P&gt;In other words, I don't want to store the raw data in Databricks again, since we already have it in S3. However, I want to be able to query the raw data from Databricks.&lt;/P&gt;</description>
      <pubDate>Wed, 19 Mar 2025 08:27:30 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113018#M44391</guid>
      <dc:creator>Jennifer</dc:creator>
      <dc:date>2025-03-19T08:27:30Z</dc:date>
    </item>
    <item>
      <title>Re: Can external tables be created backed by current cloud files without ingesting files in Databric</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113150#M44436</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;&lt;BR /&gt;Yes, Databricks does support creating external tables whose storage is not managed by Unity Catalog. Have you seen the following page of documentation: &lt;A href="https://docs.databricks.com/aws/en/tables/external" target="_blank"&gt;https://docs.databricks.com/aws/en/tables/external&lt;/A&gt;? It describes how to create external tables, and has an example notebook.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 13:49:42 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113150#M44436</guid>
      <dc:creator>Nik_Vanderhoof</dc:creator>
      <dc:date>2025-03-20T13:49:42Z</dc:date>
    </item>
    <item>
      <title>Re: Can external tables be created backed by current cloud files without ingesting files in Databric</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113151#M44437</link>
      <description>&lt;P&gt;I have read this doc, but I had no luck; it did not work for me.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 13:58:54 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113151#M44437</guid>
      <dc:creator>Jennifer</dc:creator>
      <dc:date>2025-03-20T13:58:54Z</dc:date>
    </item>
    <item>
      <title>Re: Can external tables be created backed by current cloud files without ingesting files in Databric</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113156#M44442</link>
      <description>&lt;P&gt;Can you share the commands you have tried, the errors they produced, and which Databricks Runtime versions you used? Those will help to debug.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 14:09:26 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113156#M44442</guid>
      <dc:creator>Nik_Vanderhoof</dc:creator>
      <dc:date>2025-03-20T14:09:26Z</dc:date>
    </item>
    <item>
      <title>Re: Can external tables be created backed by current cloud files without ingesting files in Databric</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113160#M44445</link>
      <description>&lt;P&gt;I think the issue is that you are trying to create a Delta table in Unity Catalog from a Parquet source without converting it to Delta format first. Unity Catalog will not allow a Delta table to be created in a non-empty location.&lt;BR /&gt;&lt;BR /&gt;Since you want to expose the files directly in UC without writing a Delta log, you can try creating an external table:&lt;/P&gt;&lt;P&gt;%sql&lt;BR /&gt;CREATE EXTERNAL TABLE catalog_name.schema_name.table_name&lt;BR /&gt;USING PARQUET&lt;BR /&gt;LOCATION 'S3 location'&lt;/P&gt;&lt;P&gt;With this approach you will lose versioning and time travel.&lt;BR /&gt;&lt;BR /&gt;I hope this works. Otherwise, convert the Parquet files to Delta in place first, then try to register the external table. That said, I am sure I have exposed files directly in the catalog by simply registering an external table.&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/30638"&gt;@Jennifer&lt;/a&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 14:38:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113160#M44445</guid>
      <dc:creator>Data_Mavericks</dc:creator>
      <dc:date>2025-03-20T14:38:02Z</dc:date>
    </item>
    <item>
      <title>Re: Can external tables be created backed by current cloud files without ingesting files in Databric</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113198#M44463</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/154427"&gt;@Data_Mavericks&lt;/a&gt;, I actually tried your suggested approach, but the table is empty after I create it. I am not sure why.&lt;/P&gt;&lt;P&gt;A recent Databricks feature that can &lt;A href="https://docs.databricks.com/aws/en/data-governance/unity-catalog/hms-federation/hms-federation-glue" target="_blank" rel="noopener"&gt;federate AWS Glue&lt;/A&gt; may help me out, since we already have these S3 files defined as tables in Glue.&lt;/P&gt;</description>
      <pubDate>Thu, 20 Mar 2025 17:57:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113198#M44463</guid>
      <dc:creator>Jennifer</dc:creator>
      <dc:date>2025-03-20T17:57:45Z</dc:date>
    </item>
    <item>
      <title>Re: Can external tables be created backed by current cloud files without ingesting files in Databric</title>
      <link>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113405#M44525</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/30638"&gt;@Jennifer&lt;/a&gt;&amp;nbsp;I am not sure about this; I did it with ADLS Gen2. For your approach, I think a managed role will be required. Could you share some more details? If you managed to expose the table in UC but are not seeing data, one thing I have done recently is create a VIEW in UC that exposes my source files directly.&lt;/P&gt;&lt;P&gt;-- Replace with your actual catalog, schema, and view names&lt;BR /&gt;CREATE OR REPLACE VIEW Enviourment.Schema.tablename&lt;BR /&gt;AS&lt;BR /&gt;SELECT&lt;BR /&gt;Record_type,&lt;BR /&gt;ACCOUNT_NUMBER,&lt;BR /&gt;CUSTOMER_NUMBER,&lt;BR /&gt;MEMBERSHIP_TYPE_CODE,&lt;BR /&gt;POINTS_BALANCE,&lt;BR /&gt;POINTS_EARNED,&lt;BR /&gt;ACCOUNT_TERMINATED_DATE&lt;BR /&gt;FROM parquet.`s3://your-bucket-name/path/to/parquet/files/`&lt;BR /&gt;-- Optional filter to ensure data integrity, if needed&lt;BR /&gt;WHERE ACCOUNT_NUMBER IS NOT NULL&lt;/P&gt;&lt;P&gt;I think this is the simplest way to expose the data if you don't want to hold it anywhere in between. As I understand it, if you want to create a table, the best practice is to keep the data in an external location and then register it.&lt;BR /&gt;&lt;BR /&gt;Let me know if I understood your issue correctly.&lt;/P&gt;</description>
      <pubDate>Mon, 24 Mar 2025 10:06:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/can-external-tables-be-created-backed-by-current-cloud-files/m-p/113405#M44525</guid>
      <dc:creator>Data_Mavericks</dc:creator>
      <dc:date>2025-03-24T10:06:34Z</dc:date>
    </item>
  </channel>
</rss>

