Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Amazon returns a 403 error code when trying to access an S3 Bucket

danatsafe
New Contributor

Hey! So far I have followed along with the Configure S3 access with instance profiles article to grant my cluster access to an S3 bucket. I have also made sure to disable IAM role passthrough on the cluster.

Upon querying the bucket through a notebook using:

dbutils.fs.ls("s3://<bucket-name>/")

I receive a 403: Access Denied message back from Amazon. I've double-checked that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are not present in the Spark environment variables. I've also gone through this article - https://kb.databricks.com/en_US/security/forbidden-access-to-s3-data - and made sure to follow all the best practices.
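For anyone verifying the same thing, here is a minimal sketch (the variable names and helper are illustrative, not from the Databricks docs) you can run in a notebook cell to confirm that no AWS credential variables are set on the driver - if any of these are present, they take precedence over the instance profile and can cause 403s:

```python
import os

# Credential variables that, if set, override the instance profile.
SUSPECT_VARS = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"]

def find_credential_overrides(env=os.environ):
    """Return the names of any AWS credential variables present in env."""
    return [name for name in SUSPECT_VARS if name in env]

# Expect an empty list when relying solely on the instance profile.
print(find_credential_overrides())
```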

Does anyone have any recommendations on what I should check or test?

3 REPLIES

User15848365773
New Contributor II

Can you check the workspace VPC's route tables?

If you're using an S3 gateway endpoint, check that the gateway endpoint's prefix list is explicitly added to the route tables of the workspace VPC subnets. If traffic goes via a traditional NAT/internet gateway, double- and triple-check the route table gateway entries. If you're using S3 interface endpoints, check that they are properly associated with the workspace VPC.

In short, cross-check all the networking for the workspace VPC (subnets, security groups, NACLs, firewall if any). If that looks fine, also check whether the user's access has any explicit Denies via endpoint policies, IAM, or bucket policies.
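On the last point about explicit Denies: a small sketch (the policy document and bucket name are illustrative) that scans a bucket policy JSON - e.g. one fetched with `aws s3api get-bucket-policy` - for Deny statements, since an explicit Deny overrides any Allow and is a common source of 403s:

```python
import json

def find_denies(policy_json: str):
    """Return all statements with Effect: Deny from a bucket policy document."""
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a lone statement may not be wrapped in a list
        statements = [statements]
    return [s for s in statements if s.get("Effect") == "Deny"]

# Example policy that denies all non-TLS access to a (hypothetical) bucket.
sample = '''{"Version": "2012-10-17",
             "Statement": [{"Effect": "Deny", "Principal": "*",
                            "Action": "s3:*",
                            "Resource": "arn:aws:s3:::my-bucket/*",
                            "Condition": {"Bool": {"aws:SecureTransport": "false"}}}]}'''
print(len(find_denies(sample)))
```

Any statement this turns up is worth comparing against how the cluster actually reaches S3 (endpoint, TLS, source VPC).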

winojoe
New Contributor III

Hi - having the same issue.  Just wondering if you were able to resolve it? If so, how?

winojoe
New Contributor III

I had the same issue and found a solution.

For me, the permission problems only exist when the cluster's (compute's) access mode is "Shared No Isolation". When the access mode is either "Shared" or "Single User", the IAM configuration applies as expected. When it's set to "Shared No Isolation", it's as if the IAM settings are not being applied, and a bunch of 403 errors are thrown.

Also, and this is interesting: the "Instance Profile" setting can be either "None" or the ARN from step 6 of the tutorial linked below - it makes no difference.

 https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html
