
Establishing Trust relationship for Databricks on AWS

gdschld
New Contributor

Hello.

Our Databricks workspace is on Azure. We are trying to connect to AWS S3 as an external source from Unity Catalog.

We have followed all the steps given here; is there anything additional required?
https://docs.databricks.com/aws/en/connect/unity-catalog/cloud-storage/storage-credentials
  1. Create an IAM role in the AWS account with a policy allowing access to the S3 buckets.
  2. Establish the trust relationship for the IAM role.
  3. Create the storage credential in Databricks and collect the external ID.
  4. Modify the role from step 1 with an sts:AssumeRole trust policy, using the external ID collected in step 3 as the Databricks identifier.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [
            "s3:GetObject",
            "s3:PutObject",
            "s3:DeleteObject",
            "s3:ListBucket",
            "s3:GetBucketLocation",
            "s3:ListBucketMultipartUploads",
            "s3:ListMultipartUploadParts",
            "s3:AbortMultipartUpload"
          ],
          "Resource": ["arn:aws:s3:::<BUCKET>/*", "arn:aws:s3:::<BUCKET>"],
          "Effect": "Allow"
        },
        {
          "Action": ["sts:AssumeRole"],
          "Resource": ["arn:aws:iam::<AWS-ACCOUNT-ID>:role/<AWS-IAM-ROLE-NAME>"],
          "Effect": "Allow"
        }
      ]
    }
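
    For anyone scripting the same setup, here is a rough boto3 sketch of attaching the policy above to the role as an inline policy (the file name, role name, and policy name below are placeholders, not values from our actual setup):

    import boto3

    iam = boto3.client("iam")

    # s3-access-policy.json holds the JSON document shown above, with
    # <BUCKET>, <AWS-ACCOUNT-ID> and <AWS-IAM-ROLE-NAME> filled in.
    with open("s3-access-policy.json") as f:
        policy_document = f.read()

    # put_role_policy creates or replaces an inline policy on the role.
    iam.put_role_policy(
        RoleName="<AWS-IAM-ROLE-NAME>",
        PolicyName="unity-catalog-s3-access",
        PolicyDocument=policy_document,
    )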


    The trust policy on the IAM role validates without errors.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": {
            "AWS": [
              "arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL",
              "arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
            ]
          },
          "Action": "sts:AssumeRole",
          "Condition": {
            "StringEquals": {
              "sts:ExternalId": "<STORAGE-CREDENTIAL-EXTERNAL-ID>"
            }
          }
        }
      ]
    }
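
    The trust policy can be applied the same way; a rough boto3 sketch, again with placeholder names:

    import boto3

    iam = boto3.client("iam")

    # trust-policy.json holds the trust policy shown above, with the account ID,
    # role name, and storage credential external ID filled in.
    with open("trust-policy.json") as f:
        trust_policy = f.read()

    # update_assume_role_policy replaces the role's trust policy in one call.
    iam.update_assume_role_policy(
        RoleName="<AWS-IAM-ROLE-NAME>",
        PolicyDocument=trust_policy,
    )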

    However, the Databricks storage credential still gives the error below when trying to validate this connection.

    In Databricks: Catalog > Credentials > Databricks Storage Credential --> Validate Configuration

    Error

    Failed - Assume Role
    Skipped - Self Assume Role
    Skipped - ExternalID Condition
    
    Missing Permissions
    Failed to get credentials: the AWS IAM role in the credential is not configured correctly. Please contact your account admin to update the configuration

    What could be the reason the validation is failing? Thanks in advance.

    Is there any additional step like setting up credentials?
 

 

2 REPLIES

BigRoux
Databricks Employee

Here are some helpful tips/guidance to help you troubleshoot:

 

To resolve the errors you're encountering when validating the Databricks storage credential connected to AWS S3, consider the following potential causes and steps based on existing documentation:

Potential Causes of Validation Errors

  1. IAM Role Configuration Issues:
    • Ensure that the IAM role is correctly set up in AWS for self-assuming capabilities. If your IAM role does not trust itself, the validation will fail. Databricks requires that the IAM role be self-assuming, which means it must include its own ARN in its trust policy.
  2. External ID Mismatches:
    • Verify that the external ID in the IAM role's trust policy matches the external ID you collected from the storage credential settings in Databricks. Any discrepancy here will cause the validation to fail.
  3. Missing Permissions:
    • Check that the IAM policies attached to the IAM role include all necessary actions. The policies must allow the role to perform the required S3 actions (e.g., s3:GetObject, s3:PutObject, etc.) as well as sts:AssumeRole. In your provided policy, ensure there are no syntax issues preventing proper authorization.

Recommended Steps to Troubleshoot

  1. Inspect Trust Policy:
    • Go to your IAM role in AWS and confirm that the trust policy is correctly configured for self-assumption (see the sketch after this list). It should look similar to this:

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "AWS": [
                "arn:aws:iam::414351767826:role/unity-catalog-prod-UCMasterRole-14S5ZJVKOTYTL",
                "arn:aws:iam::<YOUR-AWS-ACCOUNT-ID>:role/<THIS-ROLE-NAME>"
              ]
            },
            "Action": "sts:AssumeRole",
            "Condition": {
              "StringEquals": {
                "sts:ExternalId": "<STORAGE-CREDENTIAL-EXTERNAL-ID>"
              }
            }
          }
        ]
      }
    • Ensure the ExternalId is correct.
  2. Verify IAM Policies:
    • Ensure the policies attached to your IAM role grant sufficient permissions not just for the S3 bucket but also for assuming the role itself. Double-check the syntax and completeness of your policy.
  3. Review IAM Role and External ID Matches:
    • Reconfirm that the external ID used in the trust policy matches the one specified in Databricks. This is crucial for successful role assumption.
  4. Consult Documentation:
    • Consider reviewing additional details in the Databricks documentation on creating storage credentials, particularly details regarding self-assuming roles and permissions needed for integration.
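
As a quick way to check points 1 and 3 above outside the AWS console, here is a rough boto3 sketch (the role name is a placeholder) that prints the role's current trust policy and flags whether the role trusts itself and which external ID it expects:

    import boto3

    ROLE_NAME = "<AWS-IAM-ROLE-NAME>"  # placeholder for your Unity Catalog role

    iam = boto3.client("iam")
    role = iam.get_role(RoleName=ROLE_NAME)["Role"]

    # boto3 returns the trust policy already decoded into a dict.
    trust_policy = role["AssumeRolePolicyDocument"]
    role_arn = role["Arn"]

    for statement in trust_policy.get("Statement", []):
        principals = statement.get("Principal", {}).get("AWS", [])
        if isinstance(principals, str):
            principals = [principals]
        external_id = (
            statement.get("Condition", {})
            .get("StringEquals", {})
            .get("sts:ExternalId")
        )
        # The role must appear in its own trust policy (self-assuming), and the
        # external ID must match the one shown on the storage credential.
        print("trusts itself:", role_arn in principals)
        print("sts:ExternalId in trust policy:", external_id)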
 
Cheers, Lou.

Pat
Esteemed Contributor

Hi @gdschld ,

what ID have you used here:

"sts:ExternalId": "<STORAGE-CREDENTIAL-EXTERNAL-ID>"

I haven't done this for some time and got a bit confused by this STORAGE-CREDENTIAL-EXTERNAL-ID. I used to put the Databricks account ID there.
I found this, it might help: 
https://kb.databricks.com/unity-catalog/aws-s3-storage-credential-validation-failure-due-to-external...
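
If it helps, one way to double-check which external ID Unity Catalog generated for the storage credential is the Databricks SDK for Python. This is only a rough sketch: the credential name is a placeholder, and the storage_credentials call and its fields are my reading of the current SDK, so verify against the SDK docs before relying on it:

    from databricks.sdk import WorkspaceClient

    # Authenticates via the usual environment variables or a config profile.
    w = WorkspaceClient()

    # "my_s3_credential" is a placeholder for your storage credential's name.
    cred = w.storage_credentials.get(name="my_s3_credential")

    # This external ID is what goes into the sts:ExternalId condition of the
    # IAM role's trust policy.
    print(cred.aws_iam_role.external_id)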
