Amazon returns a 403 error code when trying to access an S3 Bucket
01-17-2023 02:15 PM
Hey! So far I have followed along with the Configure S3 access with instance profiles article to grant my cluster access to an S3 bucket. I have also made sure to disable IAM role passthrough on the cluster.
Upon querying the bucket through a notebook using:
dbutils.fs.ls("s3://<bucket-name>/")
I receive a 403: Access Denied message back from Amazon. I've double-checked that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not present in the Spark environment variables. I've also gone through this article - https://kb.databricks.com/en_US/security/forbidden-access-to-s3-data - and made sure to follow all the best practices.
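For reference, a rough sketch of how these checks can be run from a notebook cell (boto3 ships with standard Databricks runtimes; the bucket name is a placeholder, and the STS call simply shows which role the instance profile actually resolves to):

import os
import boto3

# 1. Confirm no static AWS keys are leaking in from the environment.
for key in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"):
    print(key, "set" if key in os.environ else "not set")

# 2. Confirm which identity the cluster actually resolves to
#    (should be the role attached via the instance profile).
print(boto3.client("sts").get_caller_identity()["Arn"])

# 3. The failing call itself.
dbutils.fs.ls("s3://<bucket-name>/")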
Does anyone have any recommendations on what I should check or test?
02-24-2023 03:42 PM
Can you check the workspace VPC's route tables?
- If you're using an S3 gateway endpoint, check whether the endpoint's prefix list is explicitly added to the route tables of the workspace VPC subnets.
- If traffic goes out via a traditional NAT gateway / internet gateway, double- and triple-check the route table gateway entries.
- If you're using S3 interface endpoints, check that they are properly attached to the workspace VPC.
In short, cross-check all of the networking for the workspace VPC (subnets, security groups, NACLs, and any firewall). If that all looks right, also check whether the user's access is hitting an explicit Deny in an endpoint policy, IAM policy, or bucket policy.
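If it helps, here is a rough boto3 sketch of those VPC checks (the region and VPC ID are placeholders; the same information is visible in the AWS console under VPC > Endpoints and Route Tables):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region
vpc_id = "vpc-0123456789abcdef0"                     # placeholder workspace VPC

# S3 endpoints (gateway or interface) attached to the workspace VPC.
endpoints = ec2.describe_vpc_endpoints(
    Filters=[
        {"Name": "vpc-id", "Values": [vpc_id]},
        {"Name": "service-name", "Values": ["com.amazonaws.us-east-1.s3"]},
    ]
)["VpcEndpoints"]
for ep in endpoints:
    print(ep["VpcEndpointType"], ep["State"], ep.get("RouteTableIds", []))

# Route tables for the VPC: look for the S3 prefix list (gateway endpoint)
# or the NAT / internet gateway routes the cluster subnets rely on.
for rt in ec2.describe_route_tables(
    Filters=[{"Name": "vpc-id", "Values": [vpc_id]}]
)["RouteTables"]:
    for route in rt["Routes"]:
        print(
            rt["RouteTableId"],
            route.get("DestinationCidrBlock") or route.get("DestinationPrefixListId"),
            route.get("GatewayId") or route.get("NatGatewayId"),
        )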
08-17-2023 10:21 PM
Hi - having the same issue. Just wondering if you were able to resolve it? If so, how?
08-18-2023 05:51 PM
I had the same issue and found a solution.
For me, the permission problems only occur when the cluster's (compute's) access mode is "Shared No Isolation". When the access mode is either "Shared" or "Single User", the IAM configuration applies as expected. When it's set to "Shared No Isolation", it's as if the IAM settings are not applied at all, and a series of 403 errors are thrown.
Also, and this is interesting, the "Instance Profile" setting can be either "None" or the ARN from step 6 of the tutorial linked below; it makes no difference.
https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html
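For anyone wanting to confirm the same thing on their own cluster, here is a rough sketch using the Clusters API (the workspace URL, token, and cluster ID are placeholders; as far as I know, "No isolation shared" shows up as data_security_mode NONE in the API response, while Shared and Single User are USER_ISOLATION and SINGLE_USER):

import requests

host = "https://<workspace-url>"      # placeholder
token = "<personal-access-token>"     # placeholder
cluster_id = "<cluster-id>"           # placeholder

resp = requests.get(
    f"{host}/api/2.0/clusters/get",
    headers={"Authorization": f"Bearer {token}"},
    params={"cluster_id": cluster_id},
)
info = resp.json()

# Access mode and instance profile as seen by the API.
print("data_security_mode:", info.get("data_security_mode"))
print("instance_profile_arn:", info.get("aws_attributes", {}).get("instance_profile_arn"))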