I want to read data through an S3 access point.
With boto3, I can successfully read the data through the access point:
import boto3

s3 = boto3.resource('s3')
ap = s3.Bucket('arn:aws:s3:[region]:[aws account id]:accesspoint/[S3 Access Point name]')
for obj in ap.objects.all():
    print(obj.key)
    print(obj.get()['Body'].read())
I then tried to read the same data through the S3 access point with PySpark.
However, the read fails with the error "java.lang.NullPointerException: null uri host. This can be caused by unencoded / in the password string".
# Fails: cannot access the data
# https://[s3-accesspoint-name]-[accountid].s3-accesspoint.[region].amazonaws.com/[file path]
df = spark.read.csv('s3a://arn:aws:s3:[region]:[aws account id]:accesspoint/[S3 access point name]/[data file path]')
df.show()
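For reference, here is a minimal sketch of how the access point ARN maps to the hostname form shown in the comment above. The helper name `access_point_url` and the sample values are hypothetical, and the sketch assumes the standard `aws` partition. It only manipulates strings and does not contact AWS or change how Spark resolves the path.

```python
def access_point_url(arn: str, key: str) -> str:
    """Build the virtual-hosted-style URL for an object behind an S3 access point.

    Hypothetical helper for illustration; assumes the standard `aws` partition.
    """
    # Expected ARN layout: arn:aws:s3:<region>:<account-id>:accesspoint/<name>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[2] != "s3" or not parts[5].startswith("accesspoint/"):
        raise ValueError(f"not an S3 access point ARN: {arn!r}")
    region, account_id = parts[3], parts[4]
    name = parts[5].split("/", 1)[1]
    # Hostname form: <name>-<account-id>.s3-accesspoint.<region>.amazonaws.com
    host = f"{name}-{account_id}.s3-accesspoint.{region}.amazonaws.com"
    return f"https://{host}/{key.lstrip('/')}"


if __name__ == "__main__":
    # Placeholder values, not a real access point.
    print(access_point_url(
        "arn:aws:s3:us-east-1:123456789012:accesspoint/my-ap",
        "data/file.csv",
    ))
```

The ARN contains several colons, which is presumably why it does not parse as a host name when placed directly after `s3a://`.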
How can I read the data through the S3 access point from PySpark?
S3 Access Point
https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-access-points.html