What is the most efficient way to get started with an S3 bucket?
05-14-2021 11:25 AM
1 REPLY
06-23-2021 10:10 AM
So if you've got an S3 bucket with your data in it, the first thing you'll need to do is connect it to a Databricks workspace to grant access. Then you can start querying the contents of the bucket from notebooks (or running jobs) by using clusters (compute resources) within the Databricks workspace to execute commands.
Here's a guide on the docs site that walks through the process to connect a bucket: https://docs.databricks.com/data/data-sources/aws/amazon-s3.html
Although the guide covers several options, I'd recommend using instance profiles and mounting the bucket via DBFS for simplicity.
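For example, assuming your cluster was launched with an instance profile that has read access to the bucket, a minimal sketch in a Python notebook might look like this (the bucket name, mount point, and folder path below are placeholders, not real names):

```python
# Assumes the cluster's instance profile grants s3:GetObject / s3:ListBucket
# on the bucket, so no access keys are needed in the code.
# "my-data-bucket" and "/mnt/my-data" are placeholder names.

# Mount the bucket into DBFS (only needs to be done once per workspace).
dbutils.fs.mount(
    source="s3a://my-data-bucket",
    mount_point="/mnt/my-data"
)

# Browse the mounted path to confirm access.
display(dbutils.fs.ls("/mnt/my-data"))

# Query the contents with Spark, e.g. a folder of Parquet files.
df = spark.read.parquet("/mnt/my-data/events/")
df.createOrReplaceTempView("events")
display(spark.sql("SELECT count(*) FROM events"))
```

Note that `dbutils`, `spark`, and `display` are available automatically inside a Databricks notebook, so no imports are required there. Once the mount exists, any cluster in the workspace can read the bucket through the `/mnt/...` path.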

