<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: S3 Permission errors in Get Started Discussions</title>
    <link>https://community.databricks.com/t5/get-started-discussions/s3-permission-errors/m-p/40517#M5666</link>
    <description>&lt;P&gt;UPDATE:&lt;/P&gt;&lt;P&gt;The permission problems occur only when the cluster's (compute's) access mode is "No Isolation Shared".&amp;nbsp; When the access mode is either "Shared" or "Single User", the IAM configuration seems to apply as expected.&amp;nbsp; When it is set to "No Isolation Shared", it's as if the IAM settings are not being applied, and a series of 403 errors is thrown.&lt;BR /&gt;&lt;BR /&gt;Also, and this is interesting, the "Instance Profile" setting can be either "None" or the ARN from step 6 described in the link below; it makes no difference.&lt;BR /&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;&lt;A href="https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html" target="_blank" rel="noopener nofollow noreferrer"&gt;https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html&lt;/A&gt;&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 19 Aug 2023 00:48:43 GMT</pubDate>
    <dc:creator>winojoe</dc:creator>
    <dc:date>2023-08-19T00:48:43Z</dc:date>
    <item>
      <title>S3 Permission errors</title>
      <link>https://community.databricks.com/t5/get-started-discussions/s3-permission-errors/m-p/40502#M5665</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Background -&lt;/P&gt;&lt;P&gt;I have an S3 datalake set up prior to signing up with Databricks.&amp;nbsp; I'm still in my evaluation period.&amp;nbsp;&lt;/P&gt;&lt;P&gt;I'm trying to read the contents of an S3 bucket but am getting all kinds of permission problems.&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the command in the notebook:&lt;/P&gt;&lt;P class=""&gt;&lt;FONT face="andale mono,times"&gt;dbutils.fs.ls("s3://hidden-bucket-name")&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;This is the result:&lt;/P&gt;&lt;P class=""&gt;&lt;FONT face="andale mono,times" size="2"&gt;java.nio.file.AccessDeniedException: s3://hidden-bucket-name: &lt;FONT color="#3366FF"&gt;getFileStatus&lt;/FONT&gt; on s3://hidden-bucket-name: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied; request: GET &lt;A href="https://hidden-bucketname.s3.us-west-1.amazonaws.com" target="_blank" rel="noopener"&gt;https://hidden-bucketname.s3.us-west-1.amazonaws.com&lt;/A&gt;&lt;SPAN class=""&gt;&amp;nbsp; &lt;/SPAN&gt;{key=[], key=[false], key=[2], key=[2], key=[/]} Hadoop 3.3.4, aws-sdk-java/1.12.390 Linux/5.15.0-1040-aws OpenJDK_64-Bit_Server_VM/25.372-b07 java/1.8.0_372 scala/2.12.15 kotlin/1.6.0 vendor/Azul_Systems,_Inc. 
cfg/retry-mode/legacy com.amazonaws.services.s3.model.ListObjectsV2Request; Request ID: VAMK134SSX7FM3VA, Extended Request ID: l/PmddVkv7otkOZnhZgeSc2HU9yiKej9ZsJ96xq3gQ+b5uQKDbw8QknQD8zJETYJM78V6jd5K74=, Cloud Provider: AWS, Instance ID: i-0feca026b7707fb3b credentials-provider: &lt;FONT color="#3366FF"&gt;com.amazonaws.auth.AnonymousAWSCredentials credential-header: no-credential-header&lt;/FONT&gt; signature-present: false (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: VAMK134SSX7FM3VA; S3 Extended Request ID: l/PmddVkv7otkOZnhZgeSc2HU9yiKej9ZsJ96xq3gQ+b5uQKDbw8QknQD8zJETYJM78V6jd5K74=; Proxy: null), S3 Extended Request ID: l/PmddVkv7otkOZnhZgeSc2HU9yiKej9ZsJ96xq3gQ+b5uQKDbw8QknQD8zJETYJM78V6jd5K74=:AccessDenied&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;There is a lot going on here. I &lt;FONT color="#3366FF"&gt;emphasized&lt;/FONT&gt; things that caught my eye&lt;/P&gt;&lt;P class=""&gt;getFileStatus.&amp;nbsp; There isn't a specific action in the IAM actions for this, so I'm not sure how to remedy this&lt;/P&gt;&lt;P class=""&gt;I followed a lot of articles&lt;/P&gt;&lt;P class=""&gt;&lt;FONT size="5"&gt;Resolutions tried:&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;A href="https://kb.databricks.com/en_US/security/forbidden-access-to-s3-data" target="_blank" rel="noopener"&gt;https://kb.databricks.com/en_US/security/forbidden-access-to-s3-data&lt;/A&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;FONT face="arial black,avant garde" size="5"&gt;&lt;STRONG&gt;Cause&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;Below are the common causes:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;AWS keys are used in addition to the IAM role. 
Using global init scripts to set the AWS keys can cause this behavior.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;&lt;EM&gt;I do have AWS keys provisioned for local Spark execution against remote S3 buckets.&lt;SPAN class=""&gt;&amp;nbsp; &lt;/SPAN&gt;I can’t imagine that this should impact an instance of Spark running in a Databricks notebook.&lt;/EM&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;&lt;I&gt;The IAM role has the required permission to access the S3 data, but AWS keys are set in the Spark configuration. For example, setting&amp;nbsp;spark.hadoop.fs.s3a.secret.key&amp;nbsp;can conflict with the IAM role.&lt;/I&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;&lt;I&gt;See above: I do have this configuration for local Spark execution&lt;/I&gt;.&lt;SPAN class=""&gt;&amp;nbsp;But as I noted above, there shouldn't be any impact on notebooks running in Databricks.&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Setting AWS keys at environment level on the driver node from an interactive cluster through a notebook.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;&lt;I&gt;I am not setting keys at environment level.&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;DBFS mount points were created earlier with AWS keys and now trying to access using an IAM role.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;&lt;I&gt;Not applicable.&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;The files are written outside Databricks, and the bucket owner does not have read permission (see&amp;nbsp;&lt;A href="https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html#iam-step-7" target="_blank" rel="noopener"&gt;&lt;SPAN class=""&gt;Step 7: Update cross-account S3 object ACLs&lt;/SPAN&gt;&lt;/A&gt;).&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;The files 
contained in the bucket were written outside Databricks, but I followed the steps here:&amp;nbsp;&lt;/FONT&gt;&lt;FONT color="#0000FF"&gt;&lt;A href="https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html" target="_blank" rel="noopener"&gt;https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html&lt;/A&gt;.&amp;nbsp;&lt;/FONT&gt;&lt;U&gt;&lt;FONT color="#FF9900"&gt;As a side note, there is no “Step 7” in the procedure above.&lt;/FONT&gt;&lt;/U&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;The IAM role is not attached to the cluster.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;&lt;I&gt;I'm not sure how to attach an IAM role to the Databricks cluster … there is an instance profile attached to the workspace, though.&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;The IAM role with read permission was attached, but you are trying to perform a write operation. That is, the IAM role does not have adequate permission for the operation you are trying to perform.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;&lt;I&gt;Not applicable.&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;FONT face="arial black,avant garde" size="5"&gt;&lt;STRONG&gt;Solution&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;Below are the recommendations and best practices to avoid this issue:&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Use IAM roles instead of AWS keys.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;Done.&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;If you are trying to switch the configuration from AWS keys to IAM roles, unmount the DBFS mount points for S3 buckets created using AWS keys and remount using the IAM role.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;Not applicable. 
&lt;SPAN class=""&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Avoid using global init script to set AWS keys. Always use a cluster-scoped init script if required.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;Not applicable; I am not using a global init script in Databricks.&lt;/FONT&gt;&lt;/P&gt;&lt;UL class=""&gt;&lt;LI&gt;Avoid setting AWS keys in a notebook or cluster Spark configuration.&lt;/LI&gt;&lt;/UL&gt;&lt;P class=""&gt;&lt;FONT color="#FF9900"&gt;I do have AWS keys provisioned, but they are only used when running Spark locally.&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;&lt;FONT face="arial black,avant garde" size="5"&gt;Closing&lt;/FONT&gt;&lt;/P&gt;&lt;P class=""&gt;Access to S3 folders is a blocker to my organization moving forward with Databricks.&amp;nbsp; I would appreciate any resolution the community can provide.&lt;/P&gt;</description>
      <pubDate>Fri, 18 Aug 2023 18:37:10 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/s3-permission-errors/m-p/40502#M5665</guid>
      <dc:creator>winojoe</dc:creator>
      <dc:date>2023-08-18T18:37:10Z</dc:date>
    </item>
    <item>
      <title>Re: S3 Permission errors</title>
      <link>https://community.databricks.com/t5/get-started-discussions/s3-permission-errors/m-p/40517#M5666</link>
      <description>&lt;P&gt;UPDATE:&lt;/P&gt;&lt;P&gt;The permission problems occur only when the cluster's (compute's) access mode is "No Isolation Shared".&amp;nbsp; When the access mode is either "Shared" or "Single User", the IAM configuration seems to apply as expected.&amp;nbsp; When it is set to "No Isolation Shared", it's as if the IAM settings are not being applied, and a series of 403 errors is thrown.&lt;BR /&gt;&lt;BR /&gt;Also, and this is interesting, the "Instance Profile" setting can be either "None" or the ARN from step 6 described in the link below; it makes no difference.&lt;BR /&gt;&lt;BR /&gt;&lt;FONT color="#0000FF"&gt;&lt;A href="https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html" target="_blank" rel="noopener nofollow noreferrer"&gt;https://docs.databricks.com/en/aws/iam/instance-profile-tutorial.html&lt;/A&gt;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 19 Aug 2023 00:48:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/get-started-discussions/s3-permission-errors/m-p/40517#M5666</guid>
      <dc:creator>winojoe</dc:creator>
      <dc:date>2023-08-19T00:48:43Z</dc:date>
    </item>
  </channel>
</rss>

