Error creating Unity Catalog external table
02-03-2025 11:55 AM
I tried creating an external table from a partitioned parquet folder in Unity Catalog.
Initially, I created the table from the Data Ingestion UI. It worked but only a tiny portion of the table was actually loaded.
Next, I tried running a SQL DDL:

CREATE TABLE IF NOT EXISTS prod.reporting.send_date_summary ( ... )
USING parquet
LOCATION 's3://....';

I tried with LOCATION set to the root folder containing all subfolders and files, as well as to the root folder with wildcards for subfolders. Querying this table threw the following error:

ExecutionException: com.databricks.sql.managedcatalog.UnityCatalogServiceException: [RequestId=b2e92078-29ce-4383-9501-3aa5adaa5e93 ErrorClass=BAD_REQUEST] Request URL is not related to this EXTERNAL table. The request url = Some(s3://....). The table storage path is Some(s3://...).
Can you tell me how external tables should be created in Unity Catalog from partitioned parquet files?
02-03-2025 12:18 PM
Hi @nwong,
Ensure that the cloud storage URI you provide is complete and correct, including the scheme, since the error appears to be complaining about it.
Can you try using this: https://docs.databricks.com/en/tables/external-partition-discovery.html
CREATE OR REPLACE TABLE <catalog>.<schema>.<table-name>
USING <format>
PARTITIONED BY (<partition-column-list>)
LOCATION 's3://<bucket-path>/<table-directory>';
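Filled in for your table, that would look roughly like this (the partition column and bucket path below are placeholders; adjust them to match your data):

CREATE TABLE IF NOT EXISTS prod.reporting.send_date_summary
USING parquet
PARTITIONED BY (partition_date)  -- placeholder partition column
LOCATION 's3://<bucket>/send_date_summary';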
02-03-2025 03:45 PM
Thanks for the suggestion.
I added
PARTITIONED BY (partition_date)
but not all of the data is loaded.
The parquet files are not stored using Hive-style partitioning. The folder layout is
s3://bucket_name/send_date_summary/report/yyyy/mm/dd/*.parquet
where yyyy/mm/dd is the partition_date.
Do I need to manually add each partition? There are hundreds.
Is there an easier way to do this?
And will enabling the partition log automatically discover new partitions that are added to the folder?
SET spark.databricks.nonDelta.partitionLog.enabled = true;
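For reference, manually registering one partition for this layout would presumably look something like this (the date below is a made-up example):

ALTER TABLE prod.reporting.send_date_summary
ADD PARTITION (partition_date = '2024-01-01')  -- made-up example date
LOCATION 's3://bucket_name/send_date_summary/report/2024/01/01/';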
02-03-2025 04:12 PM
I just tried manually adding a partition.
02-07-2025 02:13 AM
Hi.
We are experiencing the exact same problem.
Does anyone have a solution to the issue?
Or do we really need to rename our partition folders to Hive-style year=2024/month=01/day=01?
03-06-2025 02:20 AM
You can use recursiveFileLookup in OPTIONS, as shown in this topic: https://community.databricks.com/t5/data-engineering/external-table-from-external-location/td-p/6924...
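A minimal sketch of that approach, reusing the table and path names from earlier in this thread (recursiveFileLookup scans all nested subfolders, but it also disables partition discovery, so there is no PARTITIONED BY clause):

CREATE TABLE IF NOT EXISTS prod.reporting.send_date_summary
USING parquet
OPTIONS (recursiveFileLookup = 'true')  -- scan all nested subfolders
LOCATION 's3://bucket_name/send_date_summary/report/';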

