<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Access denied error while reading file from S3 to Spark in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/access-denied-error-while-reading-file-from-s3-to-spark/m-p/48889#M28406</link>
    <description>&lt;P&gt;I'm seeing an access denied error from the Spark cluster while reading an S3 file into a notebook.&lt;/P&gt;&lt;P&gt;I'm running on personal single-user compute with 13.3 LTS ML.&lt;/P&gt;&lt;P&gt;The config setup looks like this:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.access.key"&lt;/SPAN&gt;&lt;SPAN&gt;, access_id)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.secret.key"&lt;/SPAN&gt;&lt;SPAN&gt;, access_key)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.session.token"&lt;/SPAN&gt;&lt;SPAN&gt;, session_token)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.aws.credentials.provider"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.endpoint"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"s3.us-east-1.amazonaws.com"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;The code block looks like this:&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;file_location = &lt;/SPAN&gt;&lt;SPAN&gt;"s3://bucket_name/"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;file_type = 
&lt;/SPAN&gt;&lt;SPAN&gt;"parquet"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;df = spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(file_type).load(file_location)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;display(df.head())&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;FONT size="4"&gt;Error that I'm getting:&lt;/FONT&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;java.nio.file.AccessDeniedException: s3://bucket_name/xxx.parquet: getFileStatus on s3://bucket_name/xxx.parquet: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden; request: HEAD &lt;A href="https://bucket_name.parquet" target="_blank"&gt;https://bucket_name.parquet&lt;/A&gt; {} Hadoop 3.3.4, aws-sdk-java/1.12.390 Linux/5.15.0-1045-aws OpenJDK_64-Bit_Server_VM/25.372-b07 java/1.8.0_372 scala/2.12.15 kotlin/1.6.0 vendor/Azul_Systems,_Inc. cfg/retry-mode/legacy com.amazonaws.services.s3.model.GetObjectMetadataRequest; Request ID: RD3ZAB9V0G6C4W7B, Extended Request ID: 7BDXsMzY0O6RwMdKfFLlGuHlw2AkKj0+O2U6vL2UnF1nXzu9sDsVtPVH4qXv5sYzLf8vV65sNdU=, Cloud Provider: AWS, Instance ID: i-06f065a5b0db0e707 credentials-provider: com.amazonaws.auth.AnonymousAWSCredentials credential-header: no-credential-header signature-present: false (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: RD3ZAB9V0G6C4W7B; S3 Extended Request ID: 7BDXsMzY0O6RwMdKfFLlGuHlw2AkKj0+O2U6vL2UnF1nXzu9sDsVtPVH4qXv5sYzLf8vV65sNdU=; Proxy: null), S3 Extended Request ID: 7BDXsMzY0O6RwMdKfFLlGuHlw2AkKj0+O2U6vL2UnF1nXzu9sDsVtPVH4qXv5sYzLf8vV65sNdU=:403 Forbidden&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;Please help.&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
    <pubDate>Tue, 10 Oct 2023 22:19:01 GMT</pubDate>
    <dc:creator>Monika_Bagyal</dc:creator>
    <dc:date>2023-10-10T22:19:01Z</dc:date>
    <item>
      <title>Access denied error while reading file from S3 to Spark</title>
      <link>https://community.databricks.com/t5/data-engineering/access-denied-error-while-reading-file-from-s3-to-spark/m-p/48889#M28406</link>
      <description>&lt;P&gt;I'm seeing an access denied error from the Spark cluster while reading an S3 file into a notebook.&lt;/P&gt;&lt;P&gt;I'm running on personal single-user compute with 13.3 LTS ML.&lt;/P&gt;&lt;P&gt;The config setup looks like this:&lt;/P&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.access.key"&lt;/SPAN&gt;&lt;SPAN&gt;, access_id)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.secret.key"&lt;/SPAN&gt;&lt;SPAN&gt;, access_key)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.session.token"&lt;/SPAN&gt;&lt;SPAN&gt;, session_token)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.aws.credentials.provider"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;spark.conf.&lt;/SPAN&gt;&lt;SPAN&gt;set&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;"spark.hadoop.fs.s3a.endpoint"&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;"s3.us-east-1.amazonaws.com"&lt;/SPAN&gt;&lt;SPAN&gt;)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;The code block looks like this:&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;file_location = &lt;/SPAN&gt;&lt;SPAN&gt;"s3://bucket_name/"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;file_type = 
&lt;/SPAN&gt;&lt;SPAN&gt;"parquet"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;df = spark.read.&lt;/SPAN&gt;&lt;SPAN&gt;format&lt;/SPAN&gt;&lt;SPAN&gt;(file_type).load(file_location)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;display(df.head())&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;&lt;FONT size="4"&gt;Error that I'm getting:&lt;/FONT&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;SPAN&gt;java.nio.file.AccessDeniedException: s3://bucket_name/xxx.parquet: getFileStatus on s3://bucket_name/xxx.parquet: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden; request: HEAD &lt;A href="https://bucket_name.parquet" target="_blank"&gt;https://bucket_name.parquet&lt;/A&gt; {} Hadoop 3.3.4, aws-sdk-java/1.12.390 Linux/5.15.0-1045-aws OpenJDK_64-Bit_Server_VM/25.372-b07 java/1.8.0_372 scala/2.12.15 kotlin/1.6.0 vendor/Azul_Systems,_Inc. cfg/retry-mode/legacy com.amazonaws.services.s3.model.GetObjectMetadataRequest; Request ID: RD3ZAB9V0G6C4W7B, Extended Request ID: 7BDXsMzY0O6RwMdKfFLlGuHlw2AkKj0+O2U6vL2UnF1nXzu9sDsVtPVH4qXv5sYzLf8vV65sNdU=, Cloud Provider: AWS, Instance ID: i-06f065a5b0db0e707 credentials-provider: com.amazonaws.auth.AnonymousAWSCredentials credential-header: no-credential-header signature-present: false (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: RD3ZAB9V0G6C4W7B; S3 Extended Request ID: 7BDXsMzY0O6RwMdKfFLlGuHlw2AkKj0+O2U6vL2UnF1nXzu9sDsVtPVH4qXv5sYzLf8vV65sNdU=; Proxy: null), S3 Extended Request ID: 7BDXsMzY0O6RwMdKfFLlGuHlw2AkKj0+O2U6vL2UnF1nXzu9sDsVtPVH4qXv5sYzLf8vV65sNdU=:403 Forbidden&lt;/SPAN&gt;&lt;SPAN&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV&gt;&lt;FONT size="2"&gt;&lt;SPAN&gt;Please help.&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 10 Oct 2023 22:19:01 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/access-denied-error-while-reading-file-from-s3-to-spark/m-p/48889#M28406</guid>
      <dc:creator>Monika_Bagyal</dc:creator>
      <dc:date>2023-10-10T22:19:01Z</dc:date>
    </item>
  </channel>
</rss>

