How to download a file from dbfs to my local computer filesystem?

PrithwisMukerje
New Contributor II

I have run the WordCount program and saved the output into a directory as follows:

counts.saveAsTextFile("/users/data/hobbit-out1")

Then I check that the output directory contains the expected number of files:

%fs ls /users/data/hobbit-out1

and I see that three files exist:

dbfs:/users/data/hobbit-out1/_SUCCESS      _SUCCESS
dbfs:/users/data/hobbit-out1/part-00000    part-00000
dbfs:/users/data/hobbit-out1/part-00001    part-00001

Now I want to get the file dbfs:/users/data/hobbit-out1/part-00000 onto my local computer.

I understand that to access these files I have to point my browser to a URL like

https://community.cloud.databricks.com/files/my-stuff/my-file.txt?o=######

My notebook URL contains o=7892876048313913, so the URL to my file should be

https://community.cloud.databricks.com/files/users/data/hobbit-out1/part-00000?o=7892876048313913

but this leads to a 404 "file not found" error.

Can someone please tell me what my error is? Is it in the approach or in the construction of the URL?

I have not yet tried the S3 route, but I will try that if it is the ONLY way to get files out of dbfs.

Thanks for any help or guidance.

1 ACCEPTED SOLUTION

Accepted Solutions

Bill_Chambers
Contributor II

You'll have to use the FileStore.

https://docs.databricks.com/user-guide/advanced/filestore.html

Your mistake is that the file is not in the proper location: you saved it to a different directory. It has to be under /FileStore/, because only files stored there are served at the /files/ URL.
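To make the path-to-URL mapping concrete, here is a minimal Python sketch. It assumes the pattern from the question, where /files/my-stuff/my-file.txt corresponds to a file stored at /FileStore/my-stuff/my-file.txt. The helper name filestore_url and the target folder /FileStore/hobbit-out1/ are made up for illustration, and the dbutils.fs.cp call is shown only as a comment because dbutils exists only inside a Databricks notebook.

```python
from urllib.parse import quote

FILESTORE_PREFIX = "/FileStore/"


def filestore_url(dbfs_path: str, host: str, org_id: str) -> str:
    """Map a DBFS path under /FileStore/ to its browser download URL.

    Hypothetical helper: it only builds the string, it does not contact
    the workspace or check that the file exists.
    """
    # Accept both "dbfs:/FileStore/..." and "/FileStore/..." spellings.
    path = dbfs_path[len("dbfs:"):] if dbfs_path.startswith("dbfs:") else dbfs_path
    if not path.startswith(FILESTORE_PREFIX):
        raise ValueError(f"{dbfs_path} is not under /FileStore/, so it has no /files/ URL")
    relative = path[len(FILESTORE_PREFIX):]
    return f"https://{host}/files/{quote(relative)}?o={org_id}"


# In a notebook, first copy the output into FileStore (assumed target folder):
# dbutils.fs.cp("dbfs:/users/data/hobbit-out1/part-00000",
#               "dbfs:/FileStore/hobbit-out1/part-00000")

print(filestore_url("dbfs:/FileStore/hobbit-out1/part-00000",
                    "community.cloud.databricks.com",
                    "7892876048313913"))
```

The original path /users/data/hobbit-out1/part-00000 makes the helper raise ValueError, which mirrors the 404 described in the question.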

5 REPLIES 5

LiNKArsIdeni
New Contributor III

Actually, you do not have to put it in FileStore. You can use other folders, such as mnt, as well. However, if the file is stored in the mnt folder, you will need a URL like this:

https://community.cloud.databricks.com/dbfs/mnt/blob/<file_name>.csv?o=<your_number_here>

Note that this will prompt you for your login and password, but once you enter them, the download should be seamless.

Eve
New Contributor III

Also, why not simply use the CLI?

DBFS CLI

demongolem
New Contributor II

This seems by far the most straightforward solution.

databricks fs cp <file_to_download> <local_filename>

Kaniz_Fatma
Community Manager

@PrithwisMukerje,

To download a file from DBFS to your local computer filesystem, you can use the Databricks CLI command databricks fs cp.
 
Here are the steps:
 
1. Open a terminal or command prompt on your local computer.
2. Run the following command to authenticate with your Databricks workspace:
 
   databricks configure --token
 
3. Follow the prompts to enter your Databricks workspace URL and personal access token.
4. Run the following command to download the file from DBFS to your local computer:
 
   databricks fs cp dbfs:/path/to/file /path/on/local/computer
 
  Replace /path/to/file with the path to the file in DBFS and /path/on/local/computer with the path where you want to save the file on your local computer.
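If you need to fetch several part files, the download step can be scripted. The following is a rough Python sketch that wraps the same databricks fs cp command with subprocess. It assumes the Databricks CLI is installed and already configured as in steps 2 and 3; the helper names build_cp_command and download_from_dbfs are hypothetical, and the paths are taken from the question.

```python
import subprocess
from typing import List


def build_cp_command(dbfs_path: str, local_path: str) -> List[str]:
    """Return the argument list for `databricks fs cp <dbfs_path> <local_path>`."""
    return ["databricks", "fs", "cp", dbfs_path, local_path]


def download_from_dbfs(dbfs_path: str, local_path: str) -> None:
    """Run the CLI copy; raises CalledProcessError if the CLI exits non-zero."""
    # Assumes the `databricks` executable is on PATH and already authenticated.
    subprocess.run(build_cp_command(dbfs_path, local_path), check=True)


# Build the commands for both part files from the question, without running them.
parts = ["part-00000", "part-00001"]
commands = [
    build_cp_command(f"dbfs:/users/data/hobbit-out1/{name}", f"./{name}")
    for name in parts
]
for cmd in commands:
    print(" ".join(cmd))
```

Calling download_from_dbfs("dbfs:/users/data/hobbit-out1/part-00000", "./part-00000") would then perform the actual copy on a machine where the CLI is set up.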
 