Accessing S3 buckets in AWS regions that are disabled by default from Databricks.
AWS has four regions that are disabled by default. You must enable such a region before you can create and manage resources in it. The following regions are disabled by default:
- Africa (Cape Town)
- Asia Pacific (Hong Kong)
- Europe (Milan)
- Middle East (Bahrain)
In Databricks, an attempt to access or mount a bucket in one of these regions fails. The AWS SDK bundled with Databricks does not include information about the regions that are disabled by default, so automatic region detection does not work for them.
To access buckets in these regions, you must provide the region's S3 endpoint together with its STS endpoint when you mount the bucket. The following example mounts a bucket in Europe (Milan):
%scala
dbutils.fs.mount("s3a://<Bucket-in-Milan>/", "/mnt/milan/",
  extraConfigs = Map(
    "fs.s3a.credentialsType" -> "AssumeRole",
    "fs.s3a.stsAssumeRole.arn" -> "arn:aws:iam::000000000000:role/MyRole",
    "fs.s3a.acl.default" -> "BucketOwnerFullControl",
    "fs.s3a.endpoint" -> "s3.eu-south-1.amazonaws.com",
    "fs.s3a.stsAssumeRole.stsEndpoint" -> "sts.eu-south-1.amazonaws.com"
  )
)
The STS and S3 endpoints for each of these regions are:
- Africa (Cape Town): STS sts.af-south-1.amazonaws.com; S3 s3.af-south-1.amazonaws.com
- Asia Pacific (Hong Kong): STS sts.ap-east-1.amazonaws.com; S3 s3.ap-east-1.amazonaws.com
- Europe (Milan): STS sts.eu-south-1.amazonaws.com; S3 s3.eu-south-1.amazonaws.com
- Middle East (Bahrain): STS sts.me-south-1.amazonaws.com; S3 s3.me-south-1.amazonaws.com
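
All four endpoint pairs follow the same pattern, so the mount configuration can be derived from the region code instead of being typed by hand. The following is a minimal sketch of that idea, not a Databricks or AWS SDK API: the object name `DisabledRegionEndpoints` and its methods are hypothetical, and the configuration keys are simply those used in the mount example above.

```scala
// Hypothetical helper for illustration; not part of Databricks or the AWS SDK.
object DisabledRegionEndpoints {

  // Region codes of the four regions listed in this article.
  val regionCodes: Map[String, String] = Map(
    "Africa (Cape Town)"       -> "af-south-1",
    "Asia Pacific (Hong Kong)" -> "ap-east-1",
    "Europe (Milan)"           -> "eu-south-1",
    "Middle East (Bahrain)"    -> "me-south-1"
  )

  // Regional endpoints follow the pattern shown in the list above.
  def s3Endpoint(regionCode: String): String = s"s3.${regionCode}.amazonaws.com"
  def stsEndpoint(regionCode: String): String = s"sts.${regionCode}.amazonaws.com"

  // Builds the same extraConfigs map as the mount example, for a given region.
  // roleArn is the IAM role to assume (an assumption carried over from the example).
  def mountConfigs(regionCode: String, roleArn: String): Map[String, String] = {
    require(regionCodes.values.exists(_ == regionCode),
      s"Region code not covered by this article: $regionCode")
    Map(
      "fs.s3a.credentialsType"           -> "AssumeRole",
      "fs.s3a.stsAssumeRole.arn"         -> roleArn,
      "fs.s3a.acl.default"               -> "BucketOwnerFullControl",
      "fs.s3a.endpoint"                  -> s3Endpoint(regionCode),
      "fs.s3a.stsAssumeRole.stsEndpoint" -> stsEndpoint(regionCode)
    )
  }
}
```

In a notebook, the resulting map could then be passed as `extraConfigs` to `dbutils.fs.mount`, for example `DisabledRegionEndpoints.mountConfigs("eu-south-1", "arn:aws:iam::000000000000:role/MyRole")` for the Milan bucket. The `require` check rejects region codes other than the four listed here, which helps catch typos before the mount is attempted.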