Hey @YoshikiFujiwara, I took a look and have some feedback for you.
Short version: your diagnosis is right, and what it points to is an unsupported path, not a mistake in your IAM setup. Amazon S3 Access Points are not a supported target for Unity Catalog external locations on AWS today. The current AWS docs only cover external locations against standard S3 bucket paths (s3://...). There's no public doc or release note that lists S3 Access Point ARNs as a supported target, and nothing that describes special configuration for them. The behavior you captured is the known signature of this gap.
Why it behaves this way
Unity Catalog doesn't hand the compute your full IAM role. It uses credential vending (down-scoping). When a query touches the external location, UC calls AWS STS AssumeRole and attaches a session policy scoped to the requested path. Your effective S3 permission is the intersection of two things:
- Your IAM role's identity policy, which you've correctly set to s3:* on the access point ARN.
- UC's generated session policy, which is built from standard s3://bucket/prefix semantics.
That intersection is where it breaks. Standard bucket object operations authorize against arn:aws:s3:::bucket/prefix/*. Access point object operations require a different ARN namespace: arn:aws:s3:<region>:<acct>:accesspoint/<name>/object/<prefix>/*. UC's down-scoped session policy doesn't emit those access point object ARNs, and it scopes ListObjectsV2 to the root prefix only.
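To make the intersection concrete, here is a toy model in Python. It is a sketch, not AWS's policy evaluator: it only does wildcard matching on actions and resource ARNs and ignores conditions, explicit denies, and resource policies. The region, account ID, access point name, bucket name, and prefix are placeholders I made up, and the session policy shape is my assumption about what a bucket-style policy looks like, since the policy UC actually generates isn't public. It shows only the namespace mismatch, not the partial successes you observed.

```python
from fnmatch import fnmatchcase

# Hypothetical placeholders, not values from the original post.
AP_ARN = "arn:aws:s3:us-east-1:111122223333:accesspoint/my-ap"

# Identity policy: s3:* on the access point and its objects (assumed shape).
identity_policy = [(["s3:*"], [AP_ARN, AP_ARN + "/object/*"])]

# Session policy: bucket-style object ARNs only (assumed shape).
session_policy = [(["s3:GetObject", "s3:PutObject"],
                   ["arn:aws:s3:::my-bucket/data/*"])]


def allowed(policy, action, resource):
    """True if any (actions, resources) statement matches the request."""
    return any(
        any(fnmatchcase(action, a) for a in actions)
        and any(fnmatchcase(resource, r) for r in resources)
        for actions, resources in policy
    )


def effective(identity, session, action, resource):
    """With a session policy attached, both policies must allow the request."""
    return allowed(identity, action, resource) and allowed(session, action, resource)


# An object addressed through the access point namespace.
ap_object = AP_ARN + "/object/data/part-0000.parquet"

identity_ok = allowed(identity_policy, "s3:GetObject", ap_object)
combined_ok = effective(identity_policy, session_policy, "s3:GetObject", ap_object)
print(identity_ok, combined_ok)
```

In this model the identity policy allows the request on its own, but the combined check fails because the session policy never names an access point object ARN. That is the general reason widening the role's policy can't help: a session policy can only narrow what the role grants.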
That explains each symptom you saw:
- Top-level ls and explicit single-file reads match the narrow root-prefix scope, so they succeed.
- Subdirectory listing needs prefix-level ListObjectsV2 that the session policy never grants, so you get UC_CLOUD_STORAGE_ACCESS_FAILURE / UNAUTHORIZED_ACCESS.
- CREATE TABLE runs an internal write and validation that the session policy denies, so you get AccessDenied.
UC validates just enough to accept the location, but the full external-location and table workflow assumes bucket-style addressing, not access point ARN addressing. This is also why Athena, Snowflake, and EMR work against the same access point. They use the role credentials directly (or are access-point aware) and don't impose UC's path-scoped session policy.
A caution about the access_point field
If you go looking, you'll find an access_point attribute that injects the AP ARN into the session policy and partially improves things. It's what makes top-level listing and file reads succeed. Don't build on it. Per Databricks Support, that field was never released as GA and has been removed from the documentation. The partial success is a side effect of incomplete internal handling, not a supported code path. It won't get you subdirectory listing or table creation.
What I'd do from here
Your source is FSx for NetApp ONTAP exposed through an S3 Access Point, so there's no plain S3 bucket underneath to register directly. With that constraint, here's the path I'd take:
- Keep the AWS-native engines for in-place reads. Athena, Snowflake, and EMR are fine wherever you don't need UC governance.
- Stage into standard S3, then govern in UC. This is your DataSync workaround, refined. To address the duplication concern, make it incremental instead of a full copy: land data in a standard S3 bucket and use Auto Loader (cloudFiles) to ingest only new files into UC managed or external tables. That restores the full governance layer (lineage, fine-grained ACLs, row and column masking) the access point path can't give you today.
- File a feature request with your Databricks account team for native S3 Access Point support in UC credential vending. Attach the repro details you've already collected and track it under a support case. This is a real product gap, not user error.
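For the staging option, a minimal Auto Loader sketch could look like the following. Treat it as a starting point under assumptions: the function names, bucket path, checkpoint path, and table name are hypothetical, the file format is assumed to be Parquet, and it expects the `spark` session that a Databricks notebook or job provides. I'm reusing the checkpoint path as the schema location, which is a common choice but not a requirement.

```python
def auto_loader_options(file_format, schema_location):
    """Build the Auto Loader reader options as a plain dict."""
    return {
        "cloudFiles.format": file_format,
        # Where Auto Loader stores the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
    }


def ingest_new_files(spark, source_path, target_table, checkpoint_path,
                     file_format="parquet"):
    """Incrementally load files not yet seen at source_path into target_table."""
    reader = spark.readStream.format("cloudFiles")
    for key, value in auto_loader_options(file_format, checkpoint_path).items():
        reader = reader.option(key, value)
    return (
        reader.load(source_path)
        .writeStream
        # The checkpoint records which files were already ingested.
        .option("checkpointLocation", checkpoint_path)
        # Process everything currently available, then stop.
        .trigger(availableNow=True)
        .toTable(target_table)
    )


# Hypothetical usage inside a Databricks notebook or job:
# ingest_new_files(
#     spark,
#     source_path="s3://my-staging-bucket/landing/",
#     target_table="main.staging.events",
#     checkpoint_path="s3://my-staging-bucket/_checkpoints/events/",
# )
```

Because the checkpoint tracks processed files, each run picks up only what DataSync has landed since the previous run, which keeps the UC-side ingest incremental. The staging bucket path would need its own standard external location in UC.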
The bottom line: no IAM tweak will fix this, because the block is in UC's session-policy generation, not your role. Until S3 Access Points are a supported external-location target, standard S3 with Auto Loader into UC tables is the durable, fully governed pattern.
Cheers, Louis.