Autoloader: Cross-account bucket Assume role access denied
02-06-2025 12:16 AM
Hi everyone!
I have a Databricks instance profile role that has permission to assume a role in another AWS account to access an S3 bucket in that account.
When I try to assume the role using boto3, it correctly reads the Databricks AWS credentials, assumes the role, and is able to read the S3 file without any errors.
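For reference, the boto3 check described above can be sketched roughly as follows. The helper that maps the AssumeRole response onto `boto3.Session` keyword arguments reflects the standard STS response shape; the role ARN, bucket, and key are placeholders, and the live AWS calls are shown commented out because they require the instance profile credentials to be present.

```python
def session_kwargs_from_assume_role(response):
    """Map an STS AssumeRole response onto boto3.Session keyword arguments."""
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }

# On a cluster with the instance profile attached (placeholders are hypothetical):
#
# import boto3
# sts = boto3.client("sts")
# resp = sts.assume_role(RoleArn=role_arn, RoleSessionName="autoloader-debug")
# s3 = boto3.Session(**session_kwargs_from_assume_role(resp)).client("s3")
#
# Auto Loader's getFileStatus issues HEAD and list requests, so checking
# head_object and list_objects_v2, not only get_object, is a closer match
# to what the stream does:
#
# s3.head_object(Bucket="<bucket>", Key="<some-key>")
# s3.list_objects_v2(Bucket="<bucket>", MaxKeys=1)
```

If `head_object` or `list_objects_v2` also returns 403 under the assumed role, the gap is in the role's S3 permissions; if both succeed, the problem is more likely in how the cluster (rather than boto3) resolves the role.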
However, when I try to use this role in a cloudFiles stream, it fails with an AccessDenied error:

java.nio.file.AccessDeniedException: <bucket> getFileStatus on <bucket>: AmazonS3Exception: Forbidden; request: HEAD <bucket> customer-info {} Hadoop 3.3.6, 403 Forbidden
Here is sample code I am using:
options_dict = {
    "cloudFiles.roleArn": role_arn,               # role in the bucket's AWS account
    "cloudFiles.format": "json",
    "cloudFiles.schemaLocation": <schema_path>,   # path for inferred-schema tracking
    "cloudFiles.includeExistingFiles": "true",
    "multiLine": "true",
}

df = (spark.readStream
      .format("cloudFiles")
      .options(**options_dict)
      .load("<bucket>")
)
02-06-2025 07:40 AM
Hi @deng_dev, greetings!
The error message above contains a request ID; please share that request ID with the AWS team so they can check why the request is being denied, as this looks like a permissions issue on the AWS side.
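To locate it, the request ID can usually be pulled out of the exception text, since AmazonS3Exception messages typically carry a "Request ID:" field. A rough sketch (the exact message layout varies by SDK version, so the pattern here is a best-effort assumption; check your actual stack trace):

```python
import re

def extract_s3_request_id(error_text):
    """Find the first 'Request ID: <value>' field in an S3 error message.

    The message format is an assumption based on common AmazonS3Exception
    output, not a guaranteed contract.
    """
    match = re.search(r"Request ID:?\s*([A-Za-z0-9+/=]+)", error_text)
    return match.group(1) if match else None
```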
Please let me know if this helps, and leave a like if this information is useful. Follow-ups are appreciated.
Kudos
Ayushi

