THIAM_HUATTAN
Valued Contributor

Say I want to download two files from the directory dbfs:/databricks-datasets/abc-quality/ to my local filesystem. How do I do it?

I understand that if those files were inside the FileStore directory it would be much more straightforward; someone has posted a solution for that here:

https://medium.datadriveninvestor.com/how-to-download-a-file-from-databricks-filestore-to-a-local-ma...

So now I am trying to copy the files from the directory dbfs:/databricks-datasets/abc-quality/ to a directory in FileStore, which I tried as below:

dbutils.fs.cp("dbfs:/FileStore/", "dbfs:/databricks-datasets/abc-quality/", recurse=True)

but it gives me this error message:

java.rmi.RemoteException: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied; request: PUT https://databricks-datasets-oregon.s3.us-west-2.amazonaws.com quality/Bonus_jan.csv {} aws-sdk-java/1.12.101 Linux/5.4.0-1088-aws OpenJDK_64-Bit_Server_VM/25.342-b07 java/1.8.0_342 scala/2.12.14 kotlin/1.4.0 vendor/Private_Build cfg/retry-mode/legacy com.amazonaws.services.s3.transfer.TransferManager/1.12.101 com.amazonaws.services.s3.model.PutObjectRequest; Request ID: JN9487JSJFXDPAY8, Extended Request ID: zkWIIgPXRte6UWeXv5BzC/IzHXhqOntjdCyGiBw34+mm3qMi1irFm9jfY3/iwEfbt/0Ywz4TsKw=, Cloud Provider: AWS, Instance ID: i-0f553097120e72dd4 (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: JN9487JSJFXDPAY8; S3 Extended Request ID: zkWIIgPXRte6UWeXv5BzC/IzHXhqOntjdCyGiBw34+mm3qMi1irFm9jfY3/iwEfbt/0Ywz4TsKw=; Proxy: null), S3 Extended Request ID: zkWIIgPXRte6UWeXv5BzC/IzHXhqOntjdCyGiBw34+mm3qMi1irFm9jfY3/iwEfbt/0Ywz4TsKw=; nested exception is:

3 REPLIES

Pat
Honored Contributor III

Hi @THIAM HUAT TAN​ ,

Isn't dbfs:/databricks-datasets a Databricks-owned S3 bucket mounted to the workspace?

You got a 403 Access Denied error when trying to PUT files into the S3 bucket https://databricks-datasets-oregon.s3.us-west-2.amazonaws.com

You should use an S3 bucket that you own and can add files to, or, if the files are small, you might be able to use dbfs:/tmp/

thanks,

Pat.

THIAM_HUATTAN
Valued Contributor

I had the order of source and destination reversed. Now it works, thanks 👍
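For anyone landing here later: dbutils.fs.cp takes the source first and the destination second. The corrected call is shown in the comment below; since dbutils only exists inside a Databricks runtime, the runnable part is a local shutil analogue (a sketch, with hypothetical paths) illustrating the same source-then-destination ordering of a recursive copy.

```python
import shutil
import tempfile
from pathlib import Path

# On Databricks, the corrected call is source first, destination second:
#   dbutils.fs.cp("dbfs:/databricks-datasets/abc-quality/",
#                 "dbfs:/FileStore/abc-quality/", recurse=True)
# dbutils is only available in a Databricks notebook/job, so below is a
# local-filesystem analogue of the same recursive copy.

def copy_recursive(src: str, dst: str) -> list:
    """Recursively copy src into dst; return the copied file paths."""
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(str(p) for p in Path(dst).rglob("*") if p.is_file())

# Demo with a throwaway directory tree standing in for the DBFS paths.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "datasets"
    src.mkdir()
    (src / "Bonus_jan.csv").write_text("employee,bonus\n")
    copied = copy_recursive(str(src), str(Path(tmp) / "FileStore"))
    print(len(copied))  # 1
```

Swapping the two arguments, as in the original post, attempts to write into the read-only databricks-datasets mount, which is what produced the S3 403.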

Awesome @THIAM HUAT TAN​ and @Pat Sienkiewicz​!
