
Error using Auto Loader in file notification mode to get files from S3 on AWS

Chris_Konsur
New Contributor III

I configured Auto Loader in file notification mode to read files from S3 on AWS.

spark.readStream\
.format("cloudFiles")\
.option("cloudFiles.format", "json")\
.option("cloudFiles.inferColumnTypes", "true")\
.option("cloudFiles.schemaLocation", "dbfs:/auto-loader/schemas/")\
.option("cloudFiles.useNotifications", "true")\
.option("cloudFiles.includeExistingFiles", "true")\
.option("multiLine", "true")\
.option("inferSchema", "true")\
.load("s3://orcus-rave-bucket/temp/cludcad_incident3")\
.writeStream\
.option("checkpointLocation", "dbfs:/auto-loader/checkpoint04/")\
.trigger(availableNow=True)\
.table("al_table3")

I configured the IAM role as described in the documentation: https://learn.microsoft.com/en-us/azure/databricks/ingestion/auto-loader/file-notification-mode#--re...

4 REPLIES

Chris_Konsur
New Contributor III

I get the following error:

com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [BasicAWSCredentialsProvider: Access key or secret key is null, com.amazonaws.auth.InstanceProfileCredentialsProvider@5ab84d23: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/]
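
The message says the metadata endpoint returned no IAM role, which usually points to a cluster running without an instance profile. A minimal sketch for checking this from a notebook on the cluster, assuming the instance metadata service is reachable from the driver (some cluster access modes may block it). The helper names `_imds_get` and `attached_role_names` are hypothetical; the endpoint path is the one shown in the error.

```python
import urllib.error
import urllib.request

IMDS = "http://169.254.169.254/latest"


def _imds_get(path, timeout=2):
    """Fetch a path from the EC2 instance metadata service using an IMDSv2 token."""
    token_req = urllib.request.Request(
        f"{IMDS}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    req = urllib.request.Request(
        f"{IMDS}/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()


def attached_role_names(fetch=_imds_get):
    """Return the IAM role names exposed to this instance, or [] if none are found."""
    try:
        body = fetch("meta-data/iam/security-credentials/")
    except (urllib.error.URLError, OSError):
        # A 404, timeout, or unreachable endpoint all mean no usable role here.
        return []
    return [line.strip() for line in body.splitlines() if line.strip()]
```

Calling `attached_role_names()` on the driver and getting an empty list would be consistent with the error above; a non-empty list suggests the profile is attached and the problem lies elsewhere.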

Selz
New Contributor II

I have the same error after adding the IAM permissions noted in the file notification mode documentation. Were you able to find a solution?

Selz
New Contributor II

In case anyone else stumbles across this, I was able to fix my issue by setting up an instance profile with the file notification permissions and attaching the instance profile to the job cluster. It wasn't clear from the documentation that the file notification permissions can't be set up with a role and job using storage credentials. This article helped: https://medium.com/@mattwinmill88/deploying-a-databricks-aws-end-to-end-pipeline-using-terraform-921...
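
For anyone defining the job through the Jobs API or a similar tool, a rough sketch of what attaching the instance profile to the job cluster might look like. The `aws_attributes.instance_profile_arn` field is the same attribute that appears in cluster policies; the helper name `job_cluster_spec` and all placeholder values are assumptions for illustration only.

```python
import json


def job_cluster_spec(instance_profile_arn, spark_version, node_type_id, num_workers=1):
    """Build a `new_cluster` block that carries the instance profile ARN."""
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        # The instance profile that holds the file notification permissions.
        "aws_attributes": {"instance_profile_arn": instance_profile_arn},
    }


# Placeholder values; substitute the ones from your own workspace.
spec = job_cluster_spec(
    "arn:aws:iam::<account_id>:instance-profile/<profile_name>",
    spark_version="<spark_version>",
    node_type_id="<node_type_id>",
)
print(json.dumps(spec, indent=2))
```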

aliehs0510
New Contributor II

Hi @Selz,

I currently get the same error when running Auto Loader in file notification mode. I have done the following steps:
1. Set up an instance profile with the file notification permissions.
2. Added the instance profile to the Databricks workspace under Settings -> Security -> Instance profiles.
3. Configured the job compute policy to add this config:

"aws_attributes.instance_profile_arn": {
  "type": "allowlist",
  "values": [
    "arn:aws:iam::<account_id>:instance-profile/<my instance profile role>"
  ],
  "isOptional": true
},

However, I'm still getting the same error. I'm wondering whether I did something wrong or missed a step. I appreciate your guidance on this.
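
One thing that may be worth checking: as far as I understand cluster policies, an `allowlist` entry with `isOptional` set to true only limits which values are permitted and still lets the attribute be left out, so a job cluster created under this policy could start with no instance profile at all. A hedged sketch of pinning the value with a `fixed` policy element instead; the helper name `pin_instance_profile` is hypothetical, and whether this resolves the error is an assumption, not something confirmed in this thread.

```python
import json


def pin_instance_profile(policy, instance_profile_arn):
    """Return a copy of the policy that forces the given instance profile."""
    updated = dict(policy)
    updated["aws_attributes.instance_profile_arn"] = {
        "type": "fixed",  # the value is always applied and cannot be omitted
        "value": instance_profile_arn,
    }
    return updated


arn = "arn:aws:iam::<account_id>:instance-profile/<my instance profile role>"

# The policy fragment from the post, expressed as a Python dict.
policy = {
    "aws_attributes.instance_profile_arn": {
        "type": "allowlist",
        "values": [arn],
        "isOptional": True,
    }
}

pinned = pin_instance_profile(policy, arn)
print(json.dumps(pinned, indent=2))
```

Alternatively, keeping the `allowlist` and setting the instance profile explicitly on the job cluster itself should have the same effect.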