04-20-2023 03:33 PM
Based on the instructions for creating an external table (see: https://docs.databricks.com/data-governance/unity-catalog/create-tables.html#create-a-table), I had assumed that external tables were a way to add an existing object store to Unity Catalog, and that once defined they would work just like managed tables. The documentation doesn't specifically describe external tables as behaving differently, but then I read these two references today:
- “Only files in the exact directory are read; the read is not recursive”
- “When you create a table using this method, the storage path is read only once, to prevent duplication of records…”
Labels: Unity Catalog
Accepted Solutions

04-20-2023 07:46 PM
@Mark Miller :
External tables in Databricks do not automatically receive external updates. When you create an external table in Databricks, you are essentially registering, in Unity Catalog, the metadata for data that already exists in an object storage location, which allows you to query that data using SQL.
When you query an external table, Databricks reads the data from the external storage location specified in the table definition. However, Databricks does not monitor the external storage location for updates or changes to the data. If you add new files to the external storage location or modify the existing files, you need to manually update the external table metadata in Unity Catalog using the `MSCK REPAIR TABLE` command to add the new partitions or files.
The documentation you mentioned is correct that when you create an external table using the method described, the storage path is read only once to prevent duplication of records. This means that if you add new files to the external storage location after creating the external table, those files will not be included in the table until you update the metadata using `MSCK REPAIR TABLE`.
In summary, external tables in Databricks do not automatically receive external updates. You need to manually update the metadata using the `MSCK REPAIR TABLE` command to pick up new partitions or files.
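To make this concrete, here is a minimal sketch of the flow described above, assuming a hypothetical catalog/schema (`main.demo`) and storage path; adjust these to your own environment:

```sql
-- Register an external table over an existing storage location.
-- The path is scanned once at creation time.
CREATE TABLE main.demo.events (
  event_id BIGINT,
  event_date DATE
)
USING PARQUET
PARTITIONED BY (event_date)
LOCATION 'abfss://container@account.dfs.core.windows.net/path/to/events';

-- After new partition directories are added to the storage location,
-- refresh the table's partition metadata so queries can see them.
MSCK REPAIR TABLE main.demo.events;
```

Note that `MSCK REPAIR TABLE` applies to partitioned, non-Delta tables like the Parquet example above; the exact behavior for your table format is worth confirming against the Databricks documentation.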


