I have read access to an S3 bucket in an AWS account that is not mine. For more than a year I've had a job successfully reading from that bucket using dbutils.fs.mount(...) and sqlContext.read.json(...). Recently the job started failing on the sqlContext.read.json() call with the exception: "com.amazonaws.services.s3.model.AmazonS3Exception: The bucket is in this region: us-east-1. Please use this region to retry the request."

My Databricks deployment is in us-west-2, and the bucket may have been moved, but as far as I understand from this question that shouldn't be a problem: https://forums.databricks.com/questions/416/does-my-s3-data-need-to-be-in-the-same-aws-region.html

I have no problem accessing the bucket with boto3 (with no need to specify a region). I've also tried setting 'spark.hadoop.fs.s3a.endpoint' in the cluster configuration to 's3.us-east-1.amazonaws.com'. Surprisingly, this resulted in the same error, but now on the mount() command and saying that the bucket is in the 'us-west-2' region.
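In case it helps, here is roughly what the job does (the bucket name, mount point, and path below are placeholders, not the real ones; credentials come from the cluster, so they are omitted):

```python
# Minimal sketch of the job that has been running for over a year.
# Assumes the cluster already has credentials for the external bucket
# (instance profile or keys); names below are placeholders.
dbutils.fs.mount(
    source="s3a://external-bucket-name",     # bucket owned by the other AWS account
    mount_point="/mnt/external-bucket",
)

# This is the call that now fails with
# "The bucket is in this region: us-east-1. Please use this region to retry the request."
df = sqlContext.read.json("/mnt/external-bucket/path/to/json/")
df.printSchema()
```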
I'm a bit confused about the possible causes of this error and would appreciate any pointers.
Can you try configuring the client to use us-east-1? I hope it will work for you. Thank you.
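Something along these lines might work (just a sketch, not tested on your setup; the bucket name, mount point, and path are placeholders, and the exact behaviour of these settings can depend on your Hadoop/DBR version):

```python
# Point the S3A client at the us-east-1 endpoint for this session
# (fs.s3a.endpoint is the standard Hadoop S3A setting).
sc._jsc.hadoopConfiguration().set("fs.s3a.endpoint", "s3.us-east-1.amazonaws.com")

# Or pass the endpoint only for this mount via extra_configs.
# Unmount first if the old mount still exists.
dbutils.fs.unmount("/mnt/external-bucket")
dbutils.fs.mount(
    source="s3a://external-bucket-name",      # placeholder bucket name
    mount_point="/mnt/external-bucket",
    extra_configs={"fs.s3a.endpoint": "s3.us-east-1.amazonaws.com"},
)

df = sqlContext.read.json("/mnt/external-bucket/path/to/json/")
```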