Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

I would like to access S3 data in databricks

Karthe
New Contributor III

Hi all,

I am new to Databricks and am trying to read data from S3. The video tutorials I found on streaming platforms access S3 with an access key ID and a secret access key. However, Databricks is showing me different options, and I don't know what to fill in here. Could you please explain or direct me to the right tutorials?

# File location and type
file_location = "{{upload_location}}"
file_type = "{{file_type}}"

# CSV options
infer_schema = "{{infer_schema}}"
first_row_is_header = "{{first_row_is_header}}"
delimiter = "{{delimiter}}"

# The applied options are for CSV files. For other file types, these will be ignored.
df = spark.read.format(file_type) \
  .option("inferSchema", infer_schema) \
  .option("header", first_row_is_header) \
  .option("sep", delimiter) \
  .load(file_location)

display(df)

4 REPLIES

Mohit_m
Valued Contributor II

There are two ways to read from S3 in Databricks: with an IAM role or with access keys.

You can find some examples here:

https://docs.databricks.com/_static/notebooks/data-import/s3.html

https://docs.databricks.com/administration-guide/cloud-configurations/aws/instance-profiles.html
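
To make the two options concrete, here is a minimal sketch of how they might look in a notebook. It is not taken from the linked pages: the helper names `s3a_uri` and `read_csv_from_s3`, the bucket and file names, and the secret scope and key names are all hypothetical. It assumes a cluster where the notebook may set the `fs.s3a.access.key` and `fs.s3a.secret.key` Hadoop properties, which may not be permitted in every cluster access mode. The CSV option values stand in for the `{{...}}` placeholders in the question.

```python
def s3a_uri(bucket: str, key: str) -> str:
    """Build an s3a:// URI from a bucket name and an object key."""
    return f"s3a://{bucket}/{key.lstrip('/')}"


def read_csv_from_s3(spark, bucket, key, access_key=None, secret_key=None):
    """Read a CSV file from S3 into a DataFrame.

    If both keys are given, they are set on the Hadoop configuration (access-key option).
    If not, the cluster's instance profile (IAM role) is assumed to grant access.
    """
    if access_key and secret_key:
        hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
        hadoop_conf.set("fs.s3a.access.key", access_key)
        hadoop_conf.set("fs.s3a.secret.key", secret_key)
    return (
        spark.read.format("csv")
        .option("inferSchema", "true")  # example value for {{infer_schema}}
        .option("header", "true")       # example value for {{first_row_is_header}}
        .option("sep", ",")             # example value for {{delimiter}}
        .load(s3a_uri(bucket, key))
    )


# Usage inside a Databricks notebook, where `spark`, `dbutils` and `display` are predefined.
# The scope, key, bucket and file names below are hypothetical:
#
#   access_key = dbutils.secrets.get(scope="aws", key="access-key")
#   secret_key = dbutils.secrets.get(scope="aws", key="secret-key")
#   df = read_csv_from_s3(spark, "my-bucket", "data/sales.csv", access_key, secret_key)
#   display(df)
#
# With an instance profile attached to the cluster, the keys can be left out:
#
#   df = read_csv_from_s3(spark, "my-bucket", "data/sales.csv")
```

Keeping the keys in a secret scope rather than typing them into the notebook avoids leaking them through notebook history or exports.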

Karthe
New Contributor III

Thank you, Mohit. I still find it challenging, I think because I am not clear on the fundamentals. Let me try to figure it out some other way. Thank you for sharing the answer.

AmanSehgal
Honored Contributor III

You can do one of the following:

  1. Use your AWS access key and secret key to mount an S3 bucket to DBFS.
  2. Create an instance profile and access the bucket through it.
  3. Use KMS on the S3 bucket, then use the same KMS key to mount the bucket to DBFS.
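
As a rough illustration of the first option, the sketch below mounts a bucket with access keys using `dbutils.fs.mount`. The helper names `build_mount_source` and `mount_bucket`, and the bucket, mount, and secret names in the usage comment, are hypothetical. It assumes the common pattern of embedding the keys in the `s3a://` source URI, with any `/` in the secret key encoded as `%2F`. The KMS variant from option 3 is not shown.

```python
def build_mount_source(access_key: str, secret_key: str, bucket: str) -> str:
    """Build the s3a:// source URI for a key-based mount.

    A "/" in the secret key would break the URI, so it is encoded as %2F.
    """
    encoded_secret = secret_key.replace("/", "%2F")
    return f"s3a://{access_key}:{encoded_secret}@{bucket}"


def mount_bucket(dbutils, access_key, secret_key, bucket, mount_name):
    """Mount the bucket under /mnt/<mount_name> and return the mount point."""
    mount_point = f"/mnt/{mount_name}"
    dbutils.fs.mount(build_mount_source(access_key, secret_key, bucket), mount_point)
    return mount_point


# Usage inside a Databricks notebook, where `dbutils`, `spark` and `display` are predefined.
# The scope, key, bucket and mount names below are hypothetical:
#
#   access_key = dbutils.secrets.get(scope="aws", key="access-key")
#   secret_key = dbutils.secrets.get(scope="aws", key="secret-key")
#   mount_point = mount_bucket(dbutils, access_key, secret_key, "my-bucket", "my-data")
#   display(dbutils.fs.ls(mount_point))
#   df = spark.read.format("csv").option("header", "true").load(f"{mount_point}/sales.csv")
```

Once mounted, the bucket can be read like any other DBFS path, so `file_location` in the original template could point at a path under `/mnt/`.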

Vidula
Honored Contributor

Hi @Karthikeyan Palanisamy

Hope all is well! Just wanted to check in to see whether you were able to resolve your issue. If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!
