DLT workflow failing to read files from AWS S3

sudhanshu1
New Contributor III

Hi All, I am trying to read streams directly from AWS S3. I set the instance profile, but when I run the workflow it fails with the error below:

"No AWS Credentials provided by TemporaryAWSCredentialsProvider : shaded.databricks.org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key, secret key or session token is unset: "

I added the following configuration to my cluster:

fs.s3a.aws.credentials.provider org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider

fs.s3a.access.key <AccessKeyId>

fs.s3a.secret.key <SecretAccessKey>

It still fails with the same error. Could someone please help me with how to pass these settings for DLT workflows?
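For reference, here is a sketch of how I would expect these to look in the DLT pipeline settings JSON (the pipeline name is a placeholder; note that TemporaryAWSCredentialsProvider also expects fs.s3a.session.token to be set, which may be why the error message mentions the session token):

{
  "name": "s3-ingest-pipeline",
  "configuration": {
    "fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider",
    "fs.s3a.access.key": "<AccessKeyId>",
    "fs.s3a.secret.key": "<SecretAccessKey>",
    "fs.s3a.session.token": "<SessionToken>"
  }
}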


4 REPLIES

Vivian_Wilfred
Honored Contributor

Hi @SUDHANSHU RAJ, is UC enabled on this workspace? What is the access mode set on the cluster?

Is this error coming from the metastore or directly when you read from S3? Is the S3 bucket cross-account?

Dear Vivian,

UC is not enabled on this workspace. I am using an instance profile set up per the Databricks documentation.

S3 is set up for cross-account access, and as I said, I am able to run dbutils.fs.ls("s3a://zuk-comparis-poc/") successfully.

But when I run the workflow, which invokes the DLT notebook, it gives me this error.

This is a standard cluster, so I have not enabled IAM passthrough.

Am I missing something? Thanks in advance.
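For context, the DLT notebook defines a streaming table roughly like this (a simplified sketch; the table name and the path under the bucket are placeholders):

import dlt

@dlt.table(name="raw_events")
def raw_events():
    # Stream files from the S3 bucket with Auto Loader (cloudFiles).
    # "events/" and the JSON format are placeholders for the real layout.
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("s3a://zuk-comparis-poc/events/")
    )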

@SUDHANSHU RAJ Can you please share the pipeline settings in JSON and also the cluster policy JSON? If this works on a standard cluster but not from a DLT pipeline, we need to verify the DLT pipeline settings for the cluster.
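In particular, the clusters section of the pipeline settings is where the instance profile gets attached, roughly like this (a sketch; the ARN is a placeholder):

{
  "clusters": [
    {
      "label": "default",
      "aws_attributes": {
        "instance_profile_arn": "arn:aws:iam::<account-id>:instance-profile/<profile-name>"
      }
    }
  ]
}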

Accepted Solution

Hi Vivian,

Thanks for your help. I am happy to report that it's working now. The problem was in assigning the proper roles and access to the instance profile (in AWS) that I created for this purpose. Once I added a few more rules, it started working.

Thanks again for all your help.
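For anyone who hits the same error: the fix was on the AWS side. A typical S3 access policy for the instance profile role looks roughly like this (illustrative only, not the exact rules I added; the bucket name matches the one above):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::zuk-comparis-poc"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::zuk-comparis-poc/*"
    }
  ]
}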
