I would like to access S3 data in Databricks
07-14-2022 09:42 PM
Hi all,
I am new to Databricks and am trying to read data from S3. The video tutorials I found on streaming platforms connect using an access key ID and a secret access key. However, Databricks shows different options, and I don't know what to fill in here. Could you please explain, or point me to the right tutorials?
# File location and type
file_location = "{{upload_location}}"
file_type = "{{file_type}}"
# CSV options
infer_schema = "{{infer_schema}}"
first_row_is_header = "{{first_row_is_header}}"
delimiter = "{{delimiter}}"
# The applied options are for CSV files. For other file types, these will be ignored.
df = spark.read.format(file_type) \
  .option("inferSchema", infer_schema) \
  .option("header", first_row_is_header) \
  .option("sep", delimiter) \
  .load(file_location)
display(df)
07-15-2022 01:54 AM
There are two ways to read from S3 in Databricks: with an IAM role (attached to the cluster as an instance profile) or with access keys.
You can find some examples here:
https://docs.databricks.com/_static/notebooks/data-import/s3.html
https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html
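To make the two options a bit more concrete, here is a minimal sketch of how the placeholders in the question might be filled in. It assumes a classic cluster where the Hadoop S3A settings `fs.s3a.access.key` and `fs.s3a.secret.key` can be set from the notebook; the helper names `s3a_path` and `read_csv_from_s3` and the bucket/file names are made up for illustration. With an instance profile attached to the cluster, the keys are simply left out.

```python
def s3a_path(bucket: str, key: str = "") -> str:
    """Build an s3a:// URI from a bucket name and an optional object key."""
    return f"s3a://{bucket}/{key.lstrip('/')}"


def read_csv_from_s3(spark, path, access_key=None, secret_key=None):
    """Read a CSV file from S3 into a DataFrame.

    `spark` is the SparkSession that Databricks notebooks provide.
    If both keys are given, they are set on the Hadoop S3A configuration.
    If they are omitted, the cluster's instance profile (IAM role) is
    assumed to grant access to the bucket.
    """
    if access_key and secret_key:
        hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
        hadoop_conf.set("fs.s3a.access.key", access_key)
        hadoop_conf.set("fs.s3a.secret.key", secret_key)

    # Same CSV options as the notebook template in the question,
    # with example values in place of the {{...}} placeholders.
    return (
        spark.read.format("csv")
        .option("inferSchema", "true")
        .option("header", "true")
        .option("sep", ",")
        .load(path)
    )
```

In a notebook this could be called as `df = read_csv_from_s3(spark, s3a_path("my-bucket", "data/file.csv"))` followed by `display(df)`, where `my-bucket` and `data/file.csv` are hypothetical names. If you do use keys, it is safer to load them from a secret scope than to paste them into the notebook.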
07-17-2022 09:48 PM
Thank you, Mohit. I still find it challenging, I think because I am not clear on the fundamentals. Let me try to figure it out some other way. Thank you for sharing the answer.
07-15-2022 08:06 PM
You can do any of the following:
- Use your AWS access key and secret key to mount the S3 bucket to DBFS.
- Create an instance profile and access the bucket through it.
- If the S3 bucket is encrypted with KMS, use the same KMS key when mounting the bucket to DBFS.
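As a rough sketch of the first two options, the snippet below builds the source URI for a mount and passes it to `dbutils.fs.mount`. The helper names `mount_source` and `mount_bucket`, the bucket name, and the mount name are hypothetical. The secret key is URL-encoded because it may contain characters such as `/` that would otherwise break the URI. The KMS variant is not shown here.

```python
from urllib.parse import quote


def mount_source(bucket, access_key=None, secret_key=None):
    """Build the source URI for mounting an S3 bucket.

    With keys: s3a://<access-key>:<url-encoded-secret-key>@<bucket>
    Without keys: s3a://<bucket> (relies on the cluster's instance profile)
    """
    if access_key and secret_key:
        encoded_secret = quote(secret_key, safe="")
        return f"s3a://{access_key}:{encoded_secret}@{bucket}"
    return f"s3a://{bucket}"


def mount_bucket(dbutils, bucket, mount_name, access_key=None, secret_key=None):
    """Mount the bucket under /mnt/<mount_name> and return the mount point.

    `dbutils` is the utility object that Databricks notebooks provide.
    """
    mount_point = f"/mnt/{mount_name}"
    dbutils.fs.mount(mount_source(bucket, access_key, secret_key), mount_point)
    return mount_point
```

After something like `mount_bucket(dbutils, "my-bucket", "my-data")` in a notebook, files in the bucket should be readable with ordinary paths such as `/mnt/my-data/file.csv`, so `file_location` in the template could point there.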
09-04-2022 10:52 PM
Hi @Karthikeyan Palanisamy
Hope all is well! Just wanted to check in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.
We'd love to hear from you.
Thanks!