04-08-2025 08:58 PM
Hi Dnirmania,
How are you doing today? As I understand it, you're on the right track. Connecting AWS S3 to Azure Databricks is a useful setup, but it can be a bit tricky.

From what you shared, the code looks mostly fine, but make sure your S3 bucket allows access from outside AWS. Sometimes the bucket policy needs to be updated to allow cross-cloud access, for example if it limits requests to specific VPCs or IP ranges. Also, double-check that your fs.s3a.endpoint matches the region where your S3 bucket is located (for example, s3.eu-west-1.amazonaws.com if it's in Ireland).

To use Auto Loader, replace spark.read.csv() with spark.readStream.format("cloudFiles"), add .option("cloudFiles.format", "csv"), and load from your S3 path. This lets Databricks pick up new files incrementally instead of re-reading the whole bucket each time.

Lastly, it's a good idea to store your AWS credentials in Databricks secrets instead of hardcoding them in the notebook. Let me know if you want help setting up the correct S3 bucket policy or configuring Auto Loader fully!
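In case it helps, here is a rough sketch of how the endpoint, secrets, and Auto Loader pieces could fit together. It is only an illustration under some assumptions: the secret scope name `aws`, the key names `access-key` and `secret-key`, and the helper function names are all made up for this example, and `spark` and `dbutils` are the objects a Databricks notebook normally provides. Setting the keys through the Hadoop configuration is one common approach and may not be available on every cluster access mode.

```python
from typing import Dict


def s3a_endpoint(region: str) -> str:
    """Build the regional host name to use for fs.s3a.endpoint.

    Assumes a standard AWS region such as eu-west-1; the China
    regions use a different domain suffix and are not handled here.
    """
    return f"s3.{region}.amazonaws.com"


def autoloader_options(schema_location: str) -> Dict[str, str]:
    """Reader options for Auto Loader ingesting CSV files."""
    return {
        "cloudFiles.format": "csv",
        # Where Auto Loader keeps the inferred schema between runs.
        "cloudFiles.schemaLocation": schema_location,
        # Assumption: the CSV files have a header row.
        "header": "true",
    }


def read_s3_with_autoloader(spark, dbutils, source_path, region,
                            schema_location, secret_scope="aws"):
    """Return a streaming DataFrame over the files under source_path.

    secret_scope and the two key names below are hypothetical;
    replace them with whatever you created in Databricks secrets.
    """
    access_key = dbutils.secrets.get(scope=secret_scope, key="access-key")
    secret_key = dbutils.secrets.get(scope=secret_scope, key="secret-key")

    # Pass the credentials and the region-specific endpoint to the S3A client.
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
    hadoop_conf.set("fs.s3a.access.key", access_key)
    hadoop_conf.set("fs.s3a.secret.key", secret_key)
    hadoop_conf.set("fs.s3a.endpoint", s3a_endpoint(region))

    # Incremental read: only files not seen before are picked up.
    return (
        spark.readStream.format("cloudFiles")
        .options(**autoloader_options(schema_location))
        .load(source_path)
    )
```

In a notebook you would call read_s3_with_autoloader(spark, dbutils, ...) with your own S3 path, region, and schema location, then write the result with writeStream and a checkpointLocation option so Auto Loader remembers which files it has already processed.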
Regards,
Brahma