02-27-2018 03:15 PM
Hi all,
I am using saveAsTextFile() to store the results of a Spark job in the folder dbfs:/FileStore/my_result.
I can access the different "part-xxxxx" files using the web browser, but I would like to automate the process of downloading all files to my local machine.
I have tried to use cURL, but I can't find the REST API command to download a dbfs:/FileStore file.
Question: How can I download a dbfs:/FileStore file to my local machine?
I am using Databricks Community Edition to teach an undergraduate module in Big Data Analytics in college. I have Windows 7 installed on my local machine. I have checked that cURL and the _netrc file are properly installed and configured, as I can successfully run some of the commands provided by the REST API.
Thank you very much in advance for your help!
Best regards,
Nacho
Labels: Command execution, DBFS, Rest-api, Where
Accepted Solutions
03-14-2019 08:42 PM
The answer by @tonyp works well if the file is stored in FileStore. However, if it is stored in the mnt folder, you will need something like this:
https://community.cloud.databricks.com/dbfs/mnt/blob/<file_name>.csv?o=<your_number_here>
Note that this will prompt you for your login and password, but once you enter them, the download should be seamless.
02-12-2019 12:29 AM
Files stored in /FileStore are accessible in your web browser at https://<databricks-instance-name>.cloud.databricks.com/files/. For example, the file you stored in /FileStore/my-stuff/my-file.txt is accessible at:
"https://<databricks-instance-name>.cloud.databricks.com/files/my-stuff/my-file.txt"
Note: If you are on Community Edition, you may need to replace https://community.cloud.databricks.com/files/my-stuff/my-file.txt with https://community.cloud.databricks.com/files/my-stuff/my-file.txt?o=###### where the number after o= is the same as in your Community Edition URL.
Refer: https://docs.databricks.com/user-guide/advanced/filestore.html
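To automate building these links for many "part-xxxxx" files, the URL patterns quoted in this thread can be sketched in Python. This is only an illustration: the function name `browser_url` and its parameters are hypothetical, the default host is the Community Edition one mentioned above, and the `/dbfs/mnt/...` form is taken from the accepted solution. The resulting URL may still ask you to log in.

```python
from urllib.parse import quote


def browser_url(dbfs_path, host="community.cloud.databricks.com", org_id=None):
    """Map a DBFS path to the browser URL pattern described in this thread.

    Hypothetical helper: only /FileStore and /mnt paths are handled,
    because those are the only two patterns the thread describes.
    """
    path = dbfs_path
    # Accept both "dbfs:/FileStore/..." and "/FileStore/..." spellings.
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]

    if path.startswith("/FileStore/"):
        # /FileStore/<rest> is served under /files/<rest>
        url_path = "/files/" + path[len("/FileStore/"):]
    elif path.startswith("/mnt/"):
        # Mounted storage is reached through the /dbfs prefix
        url_path = "/dbfs" + path
    else:
        raise ValueError("only /FileStore and /mnt paths are covered here")

    url = "https://" + host + quote(url_path)
    if org_id is not None:
        # The number after o= in your Community Edition URL
        url += "?o=" + str(org_id)
    return url
```

For example, `browser_url("dbfs:/FileStore/my_result/part-00000", org_id="<your_number_here>")` gives a link you can paste into the browser or hand to a download tool that carries your login session.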
03-18-2019 03:28 AM
Or simply use the CLI?
DBFS CLI
01-20-2022 05:35 AM
For me, this does not work. I am trying to understand Delta Lake as a non-technical user. I managed to create a Community Edition account and environment. Next, I followed the tutorial located here: https://docs.databricks.com/getting-started/quick-start.html
So I created the 'diamonds' table and so on. The only thing I want to do is download the parquet and json files, just to see what's inside. I use the Community Edition, but the above does not work. I have no idea how to access the files. I tried the following (I copied the right file path):
(where ### is my community number). But I receive a 401:
HTTP ERROR 401
Problem accessing /dbfs/mnt/delta/diamonds/_delta_log/00000000000000000000.json.
Reason: Unauthorized
How do I download the files? I cannot find this anywhere. Thanks for your help!
01-25-2022 08:36 PM
This looks like just an authentication issue, probably something to do with permissions. Are you trying from the CLI?
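If the 401 is indeed an authentication problem, one option is to send an authenticated request to the DBFS REST API instead of opening the URL in a browser. The following is a rough sketch, not a confirmed fix: it assumes your workspace lets you create a personal access token (which may not be possible on Community Edition), and the names `read_chunk` and `download` are made up for this example. It uses the `/api/2.0/dbfs/read` endpoint, which returns base64-encoded data in pieces of at most 1 MB.

```python
import base64
import json
import urllib.parse
import urllib.request

CHUNK_SIZE = 1024 * 1024  # a single dbfs/read call returns at most 1 MB


def read_chunk(host, token, path, offset, length=CHUNK_SIZE):
    """Fetch one chunk of a DBFS file; returns the parsed JSON reply."""
    query = urllib.parse.urlencode(
        {"path": path, "offset": offset, "length": length}
    )
    request = urllib.request.Request(
        "https://" + host + "/api/2.0/dbfs/read?" + query,
        # Assumption: a personal access token is available for this workspace.
        headers={"Authorization": "Bearer " + token},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def download(path, dest, fetch, chunk_size=CHUNK_SIZE):
    """Copy a DBFS file to a local file, chunk by chunk.

    `fetch(path, offset, length)` must return a dict with the keys
    "bytes_read" and "data" (base64 text), as dbfs/read does.
    Returns the total number of bytes written.
    """
    offset = 0
    with open(dest, "wb") as out:
        while True:
            reply = fetch(path, offset, chunk_size)
            out.write(base64.b64decode(reply["data"]))
            offset += reply["bytes_read"]
            # A short (or empty) chunk means the end of the file was reached.
            if reply["bytes_read"] < chunk_size:
                break
    return offset
```

To use it against a real workspace, bind the host and token first, for example `fetch = functools.partial(read_chunk, "<databricks-instance-name>.cloud.databricks.com", token)`, and then call `download("/mnt/delta/diamonds/_delta_log/00000000000000000000.json", "local.json", fetch)`. Passing the fetch function in separately also makes the loop easy to try out without network access.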
01-20-2022 08:43 AM
Hi! Welcome to the community and thank you for your question! My name is Piper, and I'm one of Databricks' moderators. We will give the community members a chance to respond. Then, if necessary, we'll circle back.
Thanks in advance for your patience.
03-08-2022 06:08 AM
You can leverage our DBFS CLI to download the file: https://docs.databricks.com/dev-tools/cli/dbfs-cli.html
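As a rough illustration of the CLI route, here is a small Python wrapper around `dbfs cp --recursive`, which copies a whole DBFS folder (for example all "part-xxxxx" files) to a local directory. It assumes the DBFS CLI from the linked page is installed and already configured with your credentials; the helper names `dbfs_copy_command` and `download_folder` are hypothetical.

```python
import subprocess


def dbfs_copy_command(src, dest):
    """Build the argument list for a recursive DBFS-to-local copy."""
    return ["dbfs", "cp", "--recursive", src, dest]


def download_folder(src, dest):
    """Run the copy; raises CalledProcessError if the CLI reports a failure."""
    subprocess.run(dbfs_copy_command(src, dest), check=True)
```

Calling `download_folder("dbfs:/FileStore/my_result", "my_result")` would then be equivalent to typing `dbfs cp --recursive dbfs:/FileStore/my_result my_result` in a terminal.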
03-15-2022 09:57 PM
@Ignacio Castineiras, are you able to look into the DBFS CLI mentioned above? It may work for your case. Please let us know if you need further help on this. Thanks.