Hi @Retired_mod
I have seen numerous posts by you. Thanks for continuously providing support. Could you or your colleagues help with this?
We have a basic IAM user which assumes a role whose S3 policy is scoped to a specific bucket. When we read the bucket from a Databricks Python notebook using boto3, everything works fine.
As soon as we use Auto Loader it fails with an exception; the plain Spark read below fails the same way (a sketch of the Auto Loader stream follows the Spark code).
Common Code
import boto3

# AWS credentials
aws_access_key_id = ""
aws_secret_access_key = ""
role_arn = "arn:aws:iam::XXXXXXX:role/roleName"
mfa_serial_number = "XXXX"
mfa_code = input("Enter MFA code: ")  # Prompt user for the MFA code

# Create a Boto3 STS client with the provided credentials
sts_client = boto3.client(
    'sts',
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key
)

# Assume the role with Boto3, including MFA
assumed_role = sts_client.assume_role(
    RoleArn=role_arn,
    RoleSessionName='session-name',
    SerialNumber=mfa_serial_number,
    TokenCode=mfa_code
)

# Temporary credentials returned by STS
credentials = assumed_role['Credentials']
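
For completeness, the boto3 read that succeeds looks roughly like the sketch below (bucket name and object key are placeholders, not our real values):

# Minimal sketch using the temporary credentials above; bucket/key are placeholders
s3 = boto3.client(
    's3',
    aws_access_key_id=credentials['AccessKeyId'],
    aws_secret_access_key=credentials['SecretAccessKey'],
    aws_session_token=credentials['SessionToken']
)
obj = s3.get_object(Bucket='my-bucket', Key='xyz.json')  # placeholder bucket/key
print(obj['Body'].read()[:200])  # first bytes of the JSON file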
boto3 reads like the above work fine. Here is the Spark code:
from pyspark.sql import SparkSession

# Create a new Spark session configured with the assumed-role credentials
spark = SparkSession.builder \
    .appName("S3AssumeRoleSession") \
    .config("spark.hadoop.fs.s3a.assumed.role.arn", role_arn) \
    .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider") \
    .config("spark.hadoop.fs.s3a.access.key", credentials['AccessKeyId']) \
    .config("spark.hadoop.fs.s3a.secret.key", credentials['SecretAccessKey']) \
    .config("spark.hadoop.fs.s3a.session.token", credentials['SessionToken']) \
    .getOrCreate()

# Use the Spark session to read a JSON file from S3
path = "s3a://abc/xyz.json"
df = spark.read.json(path)
display(df)
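
The Auto Loader stream we attempt against the same bucket looks roughly like the sketch below (the schema location, checkpoint location, and table name are placeholders, not our real paths). Both the batch read and this stream fail with the exception shown underneath.

# Rough Auto Loader sketch, same session/credentials as above; paths and table name are placeholders
stream_df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3a://abc/_schemas/xyz")  # placeholder path
    .load("s3a://abc/"))

(stream_df.writeStream
    .option("checkpointLocation", "s3a://abc/_checkpoints/xyz")  # placeholder path
    .trigger(availableNow=True)
    .toTable("placeholder_table"))  # placeholder table name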
Exception