06-05-2017 03:23 AM
I have run the WordCount program and saved the output into a directory as follows:
counts.saveAsTextFile("/users/data/hobbit-out1")
Subsequently, I check that the output directory contains the expected number of files:
%fs ls /users/data/hobbit-out1
and I see that three files exist
dbfs:/users/data/hobbit-out1/_SUCCESS    _SUCCESS
dbfs:/users/data/hobbit-out1/part-00000    part-00000
dbfs:/users/data/hobbit-out1/part-00001    part-00001
Now I want to get the file dbfs:/users/data/hobbit-out1/part-00000 into my local computer.
I understand that to access these files I have to point my browser to a URL like
https://community.cloud.databricks.com/files/my-stuff/my-file.txt?o=######
In my notebook URL I see o=7892876048313913, so the URL to my file should be
https://community.cloud.databricks.com/files/users/data/hobbit-out1/part-00000?o=7892876048313913
but this leads to a 404 "file not found" error.
Can someone please tell me what my error is, either in the approach or in the construction of the URL?
I have not yet tried the S3 route, but I will try it if that is the ONLY way to get files out of DBFS.
Thanks for any help or guidance.
06-05-2017 08:14 AM
You'll have to use the FileStore.
https://docs.databricks.com/user-guide/advanced/filestore.html
Your mistake is that the file is not in the proper location. You saved it in a different directory. You have to put it in /FileStore/.
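For illustration, here is a rough sketch of how the FileStore route could look. It assumes the file is first copied under /FileStore from a notebook cell (the target folder /FileStore/hobbit-out1 is a made-up example, and dbutils only exists inside a Databricks notebook, so that call is left as a comment). The helper name filestore_download_url is hypothetical; it only builds the browser URL following the /files/...?o=###### pattern quoted in the question.

```python
from urllib.parse import quote

# Step 1 (notebook only, not runnable locally): copy the output under /FileStore.
# The target path is a hypothetical example.
# dbutils.fs.cp("/users/data/hobbit-out1/part-00000",
#               "/FileStore/hobbit-out1/part-00000")


def filestore_download_url(dbfs_path, org_id,
                           host="https://community.cloud.databricks.com"):
    """Build the browser URL for a file stored under /FileStore.

    Hypothetical helper: /FileStore/<rest> is mapped to <host>/files/<rest>?o=<org_id>.
    """
    path = dbfs_path
    # Accept both "dbfs:/FileStore/..." and "/FileStore/..." spellings.
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]
    prefix = "/FileStore/"
    if not path.startswith(prefix):
        raise ValueError("only paths under /FileStore/ map to a /files/ URL")
    relative = path[len(prefix):]
    # quote() keeps "/" intact and escapes characters that are unsafe in a URL.
    return f"{host}/files/{quote(relative)}?o={org_id}"


# Step 2: build the URL to paste into the browser.
print(filestore_download_url("dbfs:/FileStore/hobbit-out1/part-00000",
                             "7892876048313913"))
```

The o= value is the one taken from the notebook URL in the question. A path outside /FileStore, such as the original /users/data/hobbit-out1/part-00000, is rejected by the helper, which mirrors why the original URL returned a 404.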
03-14-2019 08:46 PM
Actually, you do not have to put it in FileStore. You can use other folders, such as mnt, as well. However, if it is stored in the mnt folder, you will need something like this:
https://community.cloud.databricks.com/dbfs/mnt/blob/<file_name>.csv?o=<your_number_here>
Note that this will prompt you for your login and password, but once you enter this, the download should be seamless.
03-18-2019 03:27 AM
Alternatively, why not simply use the CLI?
DBFS CLI
11-18-2020 02:32 PM
By far this seems the most straightforward result.
databricks fs cp <file_to_download> <local_filename>
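As a concrete illustration for the file from the original question, the command above could be wrapped in a small script. This is only a sketch: it assumes the Databricks CLI is installed and already configured with credentials for the workspace (nothing in this thread confirms that Community Edition allows this), and the function names build_dbfs_cp_command and download_from_dbfs are made up for the example.

```python
import subprocess


def build_dbfs_cp_command(dbfs_path, local_path):
    """Return the argument list for `databricks fs cp <dbfs_path> <local_path>`.

    Hypothetical helper: adds the dbfs:/ scheme when the path is given without it.
    """
    if not dbfs_path.startswith("dbfs:/"):
        dbfs_path = "dbfs:/" + dbfs_path.lstrip("/")
    return ["databricks", "fs", "cp", dbfs_path, local_path]


def download_from_dbfs(dbfs_path, local_path):
    """Run the copy; raises CalledProcessError if the CLI exits with an error."""
    subprocess.run(build_dbfs_cp_command(dbfs_path, local_path), check=True)


if __name__ == "__main__":
    # Show the command that would be run for the file from the question.
    print(" ".join(build_dbfs_cp_command("/users/data/hobbit-out1/part-00000",
                                         "part-00000.txt")))
    # Uncomment once the CLI is installed and configured:
    # download_from_dbfs("/users/data/hobbit-out1/part-00000", "part-00000.txt")
```

Keeping the command construction separate from the subprocess call makes it easy to check what would be executed before touching the workspace.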