06-25-2021 05:54 AM
I am reading data from S3 from a Databricks cluster and the read operation intermittently fails with 403 permission errors. Restarting the cluster fixes the issue.
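For context, this is roughly the kind of read that hits the error; the bucket name and prefix below are hypothetical placeholders, and `spark` is the session that Databricks notebooks provide:

```python
# Minimal sketch of the failing read; bucket and prefix are hypothetical placeholders.
df = spark.read.parquet("s3a://my-bucket/path/to/data/")
df.show(5)
# Intermittently fails with a 403 AccessDenied error until the cluster is restarted.
```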
Labels: Databricks Cluster, IAM, S3
ACCEPTED SOLUTION
06-25-2021 05:56 AM
The main reasons for this behavior are:
- AWS keys are used in addition to the IAM role. Setting the AWS keys through global init scripts can cause this behavior.
- The IAM role has the required permissions to access the S3 data, but AWS keys are also set in the Spark configuration. For example, setting spark.hadoop.fs.s3a.secret.key can conflict with the IAM role.
- AWS keys are set at the environment level on the driver node from an interactive cluster through a notebook (see the sketch after this list).
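As a rough way to check which of these applies, you can look for stray AWS keys in the Hadoop configuration and the driver environment from a notebook. This is only a diagnostic sketch: the property list is illustrative, and `_jsc.hadoopConfiguration()` is an internal PySpark accessor, not a public API.

```python
import os

# Hadoop properties where static AWS keys are commonly set (spark.hadoop.* Spark
# configs land here without the prefix). Any of these set alongside an IAM role
# is a candidate for the 403 conflict described above.
suspect_keys = [
    "fs.s3a.access.key",
    "fs.s3a.secret.key",
    "fs.s3n.awsAccessKeyId",
    "fs.s3n.awsSecretAccessKey",
]

hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
for key in suspect_keys:
    if hadoop_conf.get(key):
        print(f"Hadoop conf sets {key} (may conflict with the IAM role)")

# Environment variables on the driver (e.g. set by a global init script or via
# os.environ in a notebook) are also picked up by the AWS credential chain.
for env_var in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN"):
    if os.environ.get(env_var):
        print(f"Driver environment sets {env_var} (may conflict with the IAM role)")
```

If any of these show up, removing the corresponding init script, Spark configuration, or notebook setting so that only the IAM role supplies credentials, then restarting the cluster, is the usual remedy.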