Reading data from S3 in Azure Databricks
09-04-2024 04:44 AM
Is it possible to create an external volume in Azure Databricks that points to an external S3 bucket so that I can read files for processing? Or is it limited to ADLS Gen2 only?
09-04-2024 05:33 AM
I don't think so, but there are other ways to "mount" it as external storage. Some time ago I used sc._jsc.hadoopConfiguration().set (not sure if it still works), or maybe something like s3fs?
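A minimal sketch of that Hadoop-configuration approach in a notebook, assuming the bucket, secret scope, and key names below (they are placeholders, not real resources), with credentials pulled from a secret scope rather than hard-coded:

```python
# Sketch: set S3A credentials on the cluster's Hadoop configuration,
# then read directly via an s3a:// URI. Scope/key/bucket names are examples.
access_key = dbutils.secrets.get(scope="aws-creds", key="access-key")
secret_key = dbutils.secrets.get(scope="aws-creds", key="secret-key")

sc._jsc.hadoopConfiguration().set("fs.s3a.access.key", access_key)
sc._jsc.hadoopConfiguration().set("fs.s3a.secret.key", secret_key)

df = spark.read.parquet("s3a://my-example-bucket/raw/events/")
display(df.limit(10))
```

And, for the s3fs route, a similar sketch (assumes the library is installed, e.g. via %pip install s3fs, and the same placeholder credentials):

```python
# Sketch: open a single object from S3 with s3fs and load it into pandas.
import s3fs
import pandas as pd

fs = s3fs.S3FileSystem(key=access_key, secret=secret_key)
with fs.open("my-example-bucket/raw/events/sample.csv", "rb") as f:
    pdf = pd.read_csv(f)
```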
09-04-2024 06:15 AM
Hi @ossinova,
I think it's currently not possible to create an external volume that points to an S3 bucket in UC. But you can still access S3 data using the following techniques:
- Access S3 buckets with URIs and AWS keys (see the sketch after this list)
- Access S3 with open-source Hadoop options
- Mount the S3 bucket (but this method is deprecated)
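A minimal sketch of the first option (URIs and AWS keys), setting the credentials in the Spark session configuration; the secret scope, key names, and bucket path are assumptions for illustration:

```python
# Sketch: set S3A credentials in the Spark conf and read with an s3a:// URI.
# Scope/key/bucket names below are examples only.
access_key = dbutils.secrets.get(scope="aws-creds", key="access-key")
secret_key = dbutils.secrets.get(scope="aws-creds", key="secret-key")

spark.conf.set("fs.s3a.access.key", access_key)
spark.conf.set("fs.s3a.secret.key", secret_key)

bucket = "my-example-bucket"
df = spark.read.csv(f"s3a://{bucket}/data/sample.csv", header=True)
```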
09-16-2024 05:50 PM
Yep, I'm keen to see this functionality as well.
I think it's reasonable to expect external locations to work with diverse storage types (at least the big players). I can control access to Azure storage nicely in UC, but not S3.

