<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to read data from S3 Access Point by pyspark? in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/50831#M28910</link>
    <description>&lt;P&gt;I am trying to read JSON files from an S3 Multi-Region Access Point in a Databricks notebook. Reading directly from the S3 bucket works, but reading from the Multi-Region Access Point fails with a "java.nio.file.AccessDeniedException" error. Any guidance or support would be greatly appreciated.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;# Works: reading directly from the bucket
spark.read.json("s3://&amp;lt;bucket-name&amp;gt;/").display()

# Fails with java.nio.file.AccessDeniedException
spark.read.json("s3://accesspoint/&amp;lt;ap-name&amp;gt;.mrap").display()&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Fri, 10 Nov 2023 19:48:56 GMT</pubDate>
    <dc:creator>shrestha-rj</dc:creator>
    <dc:date>2023-11-10T19:48:56Z</dc:date>
    <item>
      <title>How to read data from S3 Access Point by pyspark?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/17636#M11614</link>
      <description>&lt;P&gt;I want to read data through an S3 Access Point.&lt;/P&gt;&lt;P&gt;I can successfully read the data through the S3 Access Point with the boto3 client:&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;import boto3

s3 = boto3.resource('s3')
# The access point ARN is passed in place of a bucket name
ap = s3.Bucket('arn:aws:s3:[region]:[aws account id]:accesspoint/[S3 Access Point name]')
for obj in ap.objects.all():
    print(obj.key)
    print(obj.get()['Body'].read())&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I then tried to read through the same S3 Access Point with PySpark, but it fails with the error "java.lang.NullPointerException: null uri host. This can be caused by unencoded / in the password string".&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;# Cannot access the data this way
# https://[s3-accesspoint-name]-[accountid].s3-accesspoint.[region].amazonaws.com/[file path]
df = spark.read.csv('s3a://arn:aws:s3:[region]:[aws account id]:accesspoint/[S3 access point name]/[data file path]')
df.show()&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;How can I access the data through the S3 Access Point?&lt;/P&gt;&lt;P&gt;S3 Access Point documentation:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-access-points.html" target="_blank"&gt;https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-access-points.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 17 Jul 2021 23:07:17 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/17636#M11614</guid>
      <dc:creator>yutaro_ono1_558</dc:creator>
      <dc:date>2021-07-17T23:07:17Z</dc:date>
    </item>
    <item>
      <title>Re: How to read data from S3 Access Point by pyspark?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/17638#M11616</link>
      <description>&lt;P&gt;Did you get a chance to follow up on this issue, Kaniz?&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jan 2022 08:58:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/17638#M11616</guid>
      <dc:creator>Niclas</dc:creator>
      <dc:date>2022-01-21T08:58:19Z</dc:date>
    </item>
    <item>
      <title>Re: How to read data from S3 Access Point by pyspark?</title>
      <link>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/50831#M28910</link>
      <description>&lt;P&gt;I am trying to read JSON files from an S3 Multi-Region Access Point in a Databricks notebook. Reading directly from the S3 bucket works, but reading from the Multi-Region Access Point fails with a "java.nio.file.AccessDeniedException" error. Any guidance or support would be greatly appreciated.&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;# Works: reading directly from the bucket
spark.read.json("s3://&amp;lt;bucket-name&amp;gt;/").display()

# Fails with java.nio.file.AccessDeniedException
spark.read.json("s3://accesspoint/&amp;lt;ap-name&amp;gt;.mrap").display()&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 10 Nov 2023 19:48:56 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/how-to-read-data-from-s3-access-point-by-pyspark/m-p/50831#M28910</guid>
      <dc:creator>shrestha-rj</dc:creator>
      <dc:date>2023-11-10T19:48:56Z</dc:date>
    </item>
  </channel>
</rss>

