Create external table using multiple paths/locations
09-07-2023 02:30 PM
I want to create an external table from more than a single path. I have configured my storage credentials and added an external location, and I can successfully create a table using the following code:
CREATE TABLE test.base.Example
USING csv
OPTIONS (
  header = "true"
)
LOCATION 'abfss://test@exampleblob.dfs.core.windows.net/2022/08/data/'
But I have lots of data partitioned by month and date, so I'm wondering if there is any way to import data from multiple paths using wildcards or something similar, as outlined in the code below?
CREATE TABLE test.base.Example
USING csv
OPTIONS (
  header = "true"
)
LOCATION 'abfss://test@exampleblob.dfs.core.windows.net/*/*/data/'
Many thanks
Tim
08-06-2024 06:16 AM - edited 08-06-2024 06:29 AM
Hi @Retired_mod, hope you are well.
This is still not working for me.
For example:
- Full path works fine:
- abfss://<container>@<storage_account>.dfs.core.windows.net/automation/<subfolder1>/<subfolder2>/<subfolder3>/part_0_0001.csv
- But as soon as I use a wildcard, any of the options below fails:
- abfss://<container>@<storage_account>.dfs.core.windows.net/automation/<subfolder1>/<subfolder2>/<subfolder3>/*.csv (on File name)
- abfss://<container>@<storage_account>.dfs.core.windows.net/automation/<subfolder1>/<subfolder2>/*/part_00001.csv (on Subfolder)
ERROR:
Failure to initialize configuration for storage account <storage_account>.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.key
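For context on this error: it usually indicates the path being read is not covered by a Unity Catalog external location, so Spark falls back to legacy `fs.azure.account.key` authentication, which is not configured. A minimal sketch of defining the credential coverage (the names `automation_loc`, `my_storage_cred`, and the grantee are hypothetical):

```sql
-- Sketch, assuming Unity Catalog governs access to this storage account.
-- If the path is not under a registered external location, Spark falls
-- back to account-key auth and fails with the error above.
CREATE EXTERNAL LOCATION IF NOT EXISTS automation_loc
  URL 'abfss://<container>@<storage_account>.dfs.core.windows.net/automation'
  WITH (STORAGE CREDENTIAL my_storage_cred);

-- Allow the relevant principal to read files under that location.
GRANT READ FILES ON EXTERNAL LOCATION automation_loc TO `data_engineers`;
```

Note that the external location must cover the whole subtree you read from, including every folder a wildcard could expand to.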
01-09-2025 07:19 AM
You do not have to specify all the partition folders yourself. You just need to specify the parent folder, like:
CREATE OR REPLACE TABLE <catalog>.<schema>.<table-name>
USING <format>
PARTITIONED BY (<partition-column-list>)
LOCATION 's3://<bucket-path>/<table-directory>';
Further reading: https://docs.databricks.com/en/tables/external-partition-discovery.html#manually-specify-paths-for-o...
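Applied to the original question, a sketch might look like the following. The data columns (`id`, `value`) and partition column names (`year`, `month`) are hypothetical; note the asker's layout `2022/08/data/` is not Hive-style (`year=2022/month=08`), so each partition path has to be registered explicitly:

```sql
-- Sketch: partitioned external table over the asker's storage account.
-- Column names are placeholders, not from the original post.
CREATE TABLE test.base.Example (
  id STRING,
  value STRING,
  year INT,
  month INT
)
USING csv
OPTIONS (header = "true")
PARTITIONED BY (year, month)
LOCATION 'abfss://test@exampleblob.dfs.core.windows.net/';

-- Because the folders are not named year=2022/month=08, register each
-- partition's location manually:
ALTER TABLE test.base.Example ADD PARTITION (year = 2022, month = 8)
  LOCATION 'abfss://test@exampleblob.dfs.core.windows.net/2022/08/data/';

-- If the folders were renamed to Hive-style key=value paths, partitions
-- could instead be discovered automatically:
-- MSCK REPAIR TABLE test.base.Example;
```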

