Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
Download a dbfs:/FileStore File to my Local Machine?

IgnacioCastinei
New Contributor III

Hi all,

I am using saveAsTextFile() to store the results of a Spark job in the folder dbfs:/FileStore/my_result.

I can access the different "part-xxxxx" files using the web browser, but I would like to automate the process of downloading all files to my local machine.

I have tried to use cURL, but I can't find the REST API command to download a dbfs:/FileStore file.

Question: How can I download a dbfs:/FileStore file to my Local Machine?

I am using Databricks Community Edition to teach an undergraduate module in Big Data Analytics in college. I have Windows 7 installed on my local machine. I have checked that cURL and the _netrc file are properly installed and configured, as I can successfully run some of the commands provided by the REST API.

Thank you very much in advance for your help!

Best regards,

Nacho

1 ACCEPTED SOLUTION

Accepted Solutions

LiNKArsIdeni
New Contributor III

The answer by @tonyp works well if the file is stored in FileStore. However, if it is stored in the mnt folder, you will need something like this:

https://community.cloud.databricks.com/dbfs/mnt/blob/<file_name>.csv?o=<your_number_here>

Note that this will prompt you for your login and password, but once you enter this, the download should be seamless.

9 REPLIES

tonyp
New Contributor II

Files stored in /FileStore are accessible in your web browser at https://<databricks-instance-name>.cloud.databricks.com/files/. For example, the file you stored in /FileStore/my-stuff/my-file.txt is accessible at:

"https://<databricks-instance-name>.cloud.databricks.com/files/my-stuff/my-file.txt"

Note: If you are on Community Edition, you may need to replace https://community.cloud.databricks.com/files/my-stuff/my-file.txt with https://community.cloud.databricks.com/files/my-stuff/my-file.txt?o=###### where the number after o= is the same as in your Community Edition URL.

Refer: https://docs.databricks.com/user-guide/advanced/filestore.html
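
To automate the mapping described above, here is a minimal Python sketch that turns a dbfs:/FileStore path into the matching /files/ URL. The function name filestore_url and its parameters are hypothetical, made up for illustration; the host default and the optional o= number follow the Community Edition note above. It only builds the URL. Actually fetching it still requires a logged-in session.

```python
from urllib.parse import quote, urlencode


def filestore_url(dbfs_path, host="community.cloud.databricks.com", org_id=None):
    """Map a dbfs:/FileStore/... path to its browser URL under /files/.

    Hypothetical helper: org_id is the number after "o=" in your
    Community Edition URL, and may be left out on other workspaces.
    """
    # Accept both "dbfs:/FileStore/x" and "/FileStore/x".
    path = dbfs_path[len("dbfs:"):] if dbfs_path.startswith("dbfs:") else dbfs_path
    prefix = "/FileStore/"
    if not path.startswith(prefix):
        raise ValueError("not a FileStore path: " + dbfs_path)
    # Percent-encode the remainder, keeping "/" as the path separator.
    url = "https://" + host + "/files/" + quote(path[len(prefix):])
    if org_id is not None:
        url += "?" + urlencode({"o": org_id})
    return url
```

For example, filestore_url("dbfs:/FileStore/my_result/part-00000", org_id="123") gives a URL of the same shape as the one in the note, with 123 standing in for your own number.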

Eve
New Contributor III

or simply CLI?

DBFS CLI

Marc0
New Contributor II

For me, this does not work. I am trying to understand delta lake as a non tech user. I managed to create a community edition account and environment. Next, I followed the tutorial located here: https://docs.databricks.com/getting-started/quick-start.html

So I created the 'diamonds' table and so on. The only thing I want to do is to download the parquet and json files, just to see what's inside. I use the community edition, but the above does not work. Just no idea how to access the files. I tried the following (copied the right file path):

https://community.cloud.databricks.com/dbfs/mnt/delta/diamonds/_delta_log/00000000000000000000.json?...

(where ### is my community number indeed). But I receive a 401:

HTTP ERROR 401

Problem accessing /dbfs/mnt/delta/diamonds/_delta_log/00000000000000000000.json.

Reason: Unauthorized

How do I download the files? Cannot find it anywhere. Thanks for your help!

Atanu
Databricks Employee

It looks like an authentication issue, something to do with permissions. Are you trying this from the CLI?

Anonymous
Not applicable

Hi! Welcome to the community and thank you for your question! My name is Piper, and I'm one of Databricks' moderators. We will give the community members a chance to respond. Then, if necessary, we'll circle back.

Thanks in advance for your patience.

Atanu
Databricks Employee

You can leverage our DBFS CLI to download the file: https://docs.databricks.com/dev-tools/cli/dbfs-cli.html
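
If you would rather script the download yourself, as the original question asks, the DBFS REST API has a read endpoint (GET /api/2.0/dbfs/read) that returns base64-encoded chunks of at most 1 MB. Below is a rough Python sketch, not an official client. The helper names read_dbfs_file and make_http_fetch are hypothetical, bearer-token authentication is an assumption (it may not be available on Community Edition), and the loop assumes that a read starting exactly at the end of the file returns bytes_read 0.

```python
import base64
import json
import urllib.parse
import urllib.request

CHUNK = 1024 * 1024  # the read endpoint returns at most 1 MB per call


def read_dbfs_file(fetch_chunk, path, chunk_size=CHUNK):
    """Assemble a whole file from repeated read calls.

    fetch_chunk(path, offset, length) must return the parsed JSON reply of
    GET /api/2.0/dbfs/read: {"bytes_read": int, "data": base64 string}.
    """
    parts, offset = [], 0
    while True:
        reply = fetch_chunk(path, offset, chunk_size)
        count = reply["bytes_read"]
        if count:
            parts.append(base64.b64decode(reply["data"]))
            offset += count
        # A short (or empty) read means the end of the file was reached.
        if count < chunk_size:
            break
    return b"".join(parts)


def make_http_fetch(host, token):
    """Build a fetch_chunk that calls the REST API (assumed: bearer-token auth)."""
    def fetch_chunk(path, offset, length):
        query = urllib.parse.urlencode(
            {"path": path, "offset": offset, "length": length}
        )
        request = urllib.request.Request(
            "https://" + host + "/api/2.0/dbfs/read?" + query,
            headers={"Authorization": "Bearer " + token},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)
    return fetch_chunk
```

Usage would look something like read_dbfs_file(make_http_fetch("<databricks-instance-name>.cloud.databricks.com", "<your_token>"), "/FileStore/my_result/part-00000"), writing the returned bytes to a local file. Keeping the HTTP call separate from the chunk loop lets you try the loop with a fake fetch function before pointing it at a real workspace.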

Atanu
Databricks Employee

@Ignacio Castineiras, were you able to look into the DBFS CLI mentioned above? It may work for your case. Please let us know if you need further help with this. Thanks.
