
mount s3 bucket with specific endpoint

impulsleistung
New Contributor III

Environment:

  • AZURE-Databricks
  • Language: Python

I can access my s3 bucket via:

boto3.client('s3', endpoint_url='https://gateway.storjshare.io', ... )

and it also works via:

boto3.resource('s3', endpoint_url='https://gateway.storjshare.io', ... )

As a next step, I want to mount this S3 with the specific endpoint in AZURE-Databricks, but there is not even an option for that.

How do I have to write the mount routine in the notebook?

5 REPLIES

Hubert-Dudek
Esteemed Contributor III

In the AWS Console, under "My security credentials", generate a new access key and secret key.

Set them in the Hadoop configuration:

sc._jsc.hadoopConfiguration().set("fs.s3n.awsAccessKeyId", ACCESS_KEY)
sc._jsc.hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", SECRET_KEY)

Now you can read files from your S3 bucket directly

df = spark.read.csv("https://gateway.storjshare.io/test.csv", header=True, inferSchema=True)

You can also mount the bucket permanently with this command:

dbutils.fs.mount(f"s3a://{ACCESS_KEY}:{SECRET_KEY}@{aws_bucket_name}", f"/mnt/{mount_name}")

It is safer to store your access key and secret key in a key vault.
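As a rough sketch of the key-vault suggestion: on Databricks, secrets are typically read with dbutils.secrets.get. The scope name "storj" and the key names "access-key" and "secret-key" below are hypothetical placeholders, not values from this thread, and the scope would have to exist in your workspace already.

```python
def load_s3_credentials(dbutils, scope="storj",
                        access_key_name="access-key",
                        secret_key_name="secret-key"):
    """Read the S3 key pair from a Databricks secret scope.

    The scope and key names are hypothetical placeholders; replace them
    with the names used in your own workspace.
    """
    # dbutils.secrets.get returns the secret value as a string
    access_key = dbutils.secrets.get(scope=scope, key=access_key_name)
    secret_key = dbutils.secrets.get(scope=scope, key=secret_key_name)
    return access_key, secret_key


# In a notebook, where dbutils is predefined:
# ACCESS_KEY, SECRET_KEY = load_s3_credentials(dbutils)
```

Passing dbutils in as an argument keeps the helper usable outside a notebook cell, for example in a module imported by the notebook.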

This won't work. I'm using AZURE-Databricks and I want to read/write objects from/to an S3 bucket with a specific endpoint → endpoint_url='https://gateway.storjshare.io'

So this is not an I/O operation from Databricks to AWS. In addition, this is actually important because Azure Data Factory only supports reading and NOT writing back. So far, there's no user-friendly way to do so.
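One avenue that may be worth trying, offered as an untested sketch rather than a confirmed solution: the Hadoop S3A connector has an fs.s3a.endpoint setting, which plays the same role as endpoint_url in boto3. The helper names, the bucket name "my-bucket", and the choice to enable path-style access are assumptions made for illustration; whether the gateway accepts this from Azure Databricks has not been verified in this thread.

```python
def s3a_endpoint_conf(endpoint, access_key, secret_key):
    """Build Hadoop S3A settings for an S3-compatible endpoint."""
    return {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.access.key": access_key,
        "fs.s3a.secret.key": secret_key,
        # Assumption: the gateway expects path-style URLs
        # (endpoint/bucket/key) rather than bucket subdomains.
        "fs.s3a.path.style.access": "true",
    }


def apply_s3a_conf(spark, conf):
    """Copy the settings into the Hadoop configuration of a Spark session."""
    hadoop_conf = spark.sparkContext._jsc.hadoopConfiguration()
    for key, value in conf.items():
        hadoop_conf.set(key, value)


# In a notebook, where spark is predefined ("my-bucket" is a placeholder):
# conf = s3a_endpoint_conf("https://gateway.storjshare.io", ACCESS_KEY, SECRET_KEY)
# apply_s3a_conf(spark, conf)
# df = spark.read.csv("s3a://my-bucket/test.csv", header=True, inferSchema=True)
```

This reads and writes through s3a:// paths directly instead of creating a mount point, so it sidesteps the question of whether dbutils.fs.mount accepts a custom endpoint.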

Kaniz
Community Manager

Hi @Kevin Ostheimer, we haven’t heard from you since the last response from @Hubert Dudek, and I was checking back to see if you have a resolution yet.

If you have any solution, please share it with the community as it can be helpful to others. Otherwise, we will respond with more details and try to help.

Also, please don't forget to click on the "Select As Best" button whenever the information provided helps resolve your question.

impulsleistung
New Contributor III

Hi! I just tried. I'm on AZURE and the endpoint is proprietary; see my reply.

Anonymous
Not applicable

Hi @Kevin Ostheimer

Hope all is well! Just wanted to check in: were you able to resolve your issue, and if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!
