
Auto Loader in file notification mode to get files from S3 on AWS - Error

Chris_Konsur
New Contributor III

I configured an autoloader in file notification mode to get files from S3 on AWS.

spark.readStream \
    .format("cloudFiles") \
    .option("cloudFiles.format", "json") \
    .option("cloudFiles.inferColumnTypes", "true") \
    .option("cloudFiles.schemaLocation", "dbfs:/auto-loader/schemas/") \
    .option("cloudFiles.useNotifications", "true") \
    .option("includeExistingFiles", "true") \
    .option("multiLine", "true") \
    .option("inferSchema", "true") \
    .load("s3://orcus-rave-bucket/temp/cludcad_incident3") \
    .writeStream \
    .option("checkpointLocation", "dbfs:/auto-loader/checkpoint04/") \
    .trigger(availableNow=True) \
    .table("al_table3")

I configured the IAM role based on this document: https://learn.microsoft.com/en-us/azure/databricks/ingestion/auto-loader/file-notification-mode#--re...
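
A side note on the reader options in the snippet above, offered as a sketch and not as a fix for the credentials error: the Auto Loader option names carry the `cloudFiles.` prefix, so `includeExistingFiles` would normally be written `cloudFiles.includeExistingFiles`. The names `autoloader_options` and `unprefixed_options` are hypothetical, the values are copied from the post, and `inferSchema` is left out on the assumption that `cloudFiles.inferColumnTypes` together with the schema location already covers inference.

```python
# Hypothetical options dict; values are copied from the post above.
autoloader_options = {
    "cloudFiles.format": "json",
    "cloudFiles.inferColumnTypes": "true",
    "cloudFiles.schemaLocation": "dbfs:/auto-loader/schemas/",
    "cloudFiles.useNotifications": "true",
    # Auto Loader option names carry the "cloudFiles." prefix.
    "cloudFiles.includeExistingFiles": "true",
    # multiLine belongs to the JSON reader, so it has no prefix.
    "multiLine": "true",
}


def unprefixed_options(options):
    """Return the option names that do not start with 'cloudFiles.'."""
    return sorted(k for k in options if not k.startswith("cloudFiles."))


# Any name listed here is handled by the file-format reader, not Auto Loader.
print(unprefixed_options(autoloader_options))

# On a Databricks cluster (where `spark` exists) the dict could be applied as:
# spark.readStream.format("cloudFiles").options(**autoloader_options).load(path)
```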

4 REPLIES

Chris_Konsur
New Contributor III

I get the following error:

com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [BasicAWSCredentialsProvider: Access key or secret key is null, com.amazonaws.auth.InstanceProfileCredentialsProvider@5ab84d23: The requested metadata is not found at http://169.254.169.254/latest/meta-data/iam/security-credentials/]

Selz
New Contributor II

I have the same error after adding the IAM permissions noted in the file notification mode documentation. Were you able to find a solution?

Selz
New Contributor II

In case anyone else stumbles across this, I was able to fix my issue by setting up an instance profile with the file notification permissions and attaching the instance profile to the job cluster. It wasn't clear from the documentation that the file notification permissions can't be set up with a role and job using storage credentials. This article helped: https://medium.com/@mattwinmill88/deploying-a-databricks-aws-end-to-end-pipeline-using-terraform-921...
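
The fix described above can be sketched as a job cluster definition that carries the instance profile. This assumes the `new_cluster` shape used by the Databricks Jobs and Clusters APIs, where the profile goes under `aws_attributes.instance_profile_arn`. The function name, Spark version, node type, and ARN are placeholders and do not come from the thread.

```python
def job_cluster_with_instance_profile(instance_profile_arn,
                                      spark_version="13.3.x-scala2.12",
                                      node_type_id="i3.xlarge",
                                      num_workers=1):
    """Build a job cluster spec with an instance profile attached.

    All default values are placeholders; adjust them to your workspace.
    """
    # An instance profile ARN differs from a role ARN: it contains
    # ":instance-profile/" rather than ":role/".
    if ":instance-profile/" not in instance_profile_arn:
        raise ValueError("expected an instance profile ARN, got: "
                         + instance_profile_arn)
    return {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
        "aws_attributes": {
            # The instance profile the cluster's EC2 instances will use.
            "instance_profile_arn": instance_profile_arn,
        },
    }


spec = job_cluster_with_instance_profile(
    "arn:aws:iam::123456789012:instance-profile/example-profile"  # placeholder
)
print(spec["aws_attributes"]["instance_profile_arn"])
```

The resulting dict would go into the `new_cluster` field of a job task. The instance profile also has to be registered in the workspace before a cluster can use it.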

aliehs0510
New Contributor II

Hi @Selz ,

I currently have the same error when running Auto Loader in file notification mode. I have done the following steps:
1. Set up an instance profile with the file notification permissions.
2. Added the instance profile to the Databricks workspace under Settings -> Security -> Instance profiles.
3. Configured the job compute policy to add this config:

"aws_attributes.instance_profile_arn": {
  "type": "allowlist",
  "values": [
    "arn:aws:iam::<account_id>:instance-profile/<my instance profile role>"
  ],
  "isOptional": true
},

However, I'm still getting the same error. I'm wondering if I did something wrong or missed a step. I appreciate your guidance on this.
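
One thing that may be worth checking here, offered as a guess and not a confirmed diagnosis: an `allowlist` entry with `isOptional` set to true permits that ARN but, as far as I understand cluster policies, does not by itself attach it. A job cluster created under the policy could then still start without any instance profile. The sketch below inspects a cluster spec, for example the `new_cluster` block of the job, and reports whether a profile is actually set. The function name and the sample dicts are hypothetical.

```python
def attached_instance_profile(cluster_spec):
    """Return the instance profile ARN set on a cluster spec, or None."""
    aws = cluster_spec.get("aws_attributes") or {}
    return aws.get("instance_profile_arn") or None


# Hypothetical specs: one without a profile, one with a placeholder ARN.
without_profile = {"spark_version": "13.3.x-scala2.12", "num_workers": 1}
with_profile = {
    "spark_version": "13.3.x-scala2.12",
    "num_workers": 1,
    "aws_attributes": {
        "instance_profile_arn":
            "arn:aws:iam::123456789012:instance-profile/example-profile",
    },
}

print(attached_instance_profile(without_profile))
print(attached_instance_profile(with_profile))
```

If this returns None for the job's cluster, the next thing to try would be selecting the instance profile on the job compute itself, or giving the policy a `fixed` value instead of an optional allowlist.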
