09-15-2021 04:49 AM
Hi all,
Until now I have been successfully using the Databricks CLI to upload files from my local machine to DBFS /FileStore/tables. Specifically, I have been running the following command in my terminal:
databricks fs cp -r <MyLocalDataset> dbfs:/FileStore/tables/NewDataset/
Since last week the command no longer seems to work. When I run it in verbose mode it appears to succeed (the copy of each file is displayed in the terminal). Moreover, if I then run the following command, the NewDataset folder is listed:
databricks fs ls dbfs:/FileStore/tables/
However, when I check the content on Databricks => Data => Create New Table => DBFS => /FileStore/tables the folder NewDataset is not there.
Moreover, if I create a Notebook and try to load the NewDataset I get the following error:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: /FileStore/tables/NewDataset
I have tried other commands of the CLI (as, for example, databricks clusters list) and they all work fine.
Am I doing something wrong, or is there a new way of uploading files to DBFS that I should be using instead?
I am using Databricks Community Edition.
Thank you very much for your time.
Kind regards,
Nacho
10-06-2021 10:25 AM
Hi Kaniz,
Thank you so much! It's been 3 weeks since I posted the question. If you can provide me with some guidance I would appreciate it.
Thank you very much in advance!
Kind regards,
Nacho
10-06-2021 03:33 PM
I tested this in my lab and found it working as expected.
Can you run the command below and see whether it returns a list of files?
databricks fs ls dbfs:/FileStore/tables/NewDataset
Also, please check whether a file with the same name as "NewDataset" already exists in dbfs:/FileStore/tables
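In case it helps, the name check can also be scripted. This is only a minimal sketch: it assumes the databricks CLI is on the PATH and already configured, and that "databricks fs ls" prints one entry name per line. The helper names list_dbfs and listing_contains are made up for illustration.

```python
import subprocess


def list_dbfs(path):
    """Return the raw text printed by `databricks fs ls <path>`.

    Assumes the databricks CLI is installed and configured.
    """
    result = subprocess.run(
        ["databricks", "fs", "ls", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the CLI exits with an error
    )
    return result.stdout


def listing_contains(listing_output, name):
    """Return True if `name` is an exact entry in the listing text.

    Assumption: one entry name per line; a trailing slash is ignored.
    """
    wanted = name.strip().rstrip("/")
    entries = [line.strip().rstrip("/") for line in listing_output.splitlines()]
    return wanted in entries
```

For example, listing_contains(list_dbfs("dbfs:/FileStore/tables/"), "NewDataset") would report whether an entry with exactly that name is present. It does not tell you whether the entry is a file or a folder.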
10-12-2021 01:30 AM
Hi @Arjun Kaimaparambil Rajan ,
Thank you for your answer. Yes, I think this is indeed the case.
I can see a mismatch between the content of the DBFS when:
In other words, the command "databricks fs cp -r <MyLocalDataset> dbfs:/FileStore/tables/NewDataset/" is uploading the dataset "somewhere", but not to the DBFS I can see with the GUI via Databricks => Data => Create New Table => DBFS => /FileStore/tables.
My question is:
Thank you very much in advance!
Kind regards,
Nacho
10-12-2021 02:03 AM
PS: I also checked the option --debug and looked for the header: x-databricks-org-id.
Both the ID of the CLI and the one appearing in the GUI are the same.
10-30-2021 07:35 AM
Hi @Arjun Kaimaparambil Rajan ,
Thank you for your reply.
Yes, I can confirm that the GUI option "Data" => "Create Table" => "Upload File" allows me to upload datasets from my local machine to DBFS.
Therefore, this can be used as an alternative to the CLI "databricks fs cp" command for uploading datasets from the local machine to DBFS /FileStore/tables/.
Two questions:
1. Would there be a similar GUI approach to download a result folder produced by a Spark Job back to the local machine?
I am aware that individual files in /FileStore/tables can be accessed via a URL, but this approach doesn't seem to work for an entire folder. While the output of "databricks fs ls" can be used to generate a script that downloads each file in turn via "wget", that seems quite tedious.
2. More generally, would it be possible to add the CLI functionality "databricks fs cp" back to the Databricks Community Edition?
The CLI "databricks fs cp" command has been working all these years until recently. Perhaps it could be considered to bring this functionality back.
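Regarding the per-file download mentioned in the first question, the URL for each file could be generated roughly as follows. This is a rough sketch and not an official method: it assumes files under dbfs:/FileStore/ are served under /files/ on the workspace host, and that the workspace may need an "o" query parameter carrying the org ID (the value seen in x-databricks-org-id). The function name filestore_urls and the sample values are hypothetical.

```python
from urllib.parse import quote

FILESTORE_PREFIX = "dbfs:/FileStore/"


def filestore_urls(host, dbfs_dir, names, org_id=None):
    """Build one download URL per file name found in a FileStore folder.

    Assumption: dbfs:/FileStore/<rel> is reachable at <host>/files/<rel>.
    """
    if not dbfs_dir.startswith(FILESTORE_PREFIX):
        raise ValueError("only paths under dbfs:/FileStore/ are handled")
    rel_dir = dbfs_dir[len(FILESTORE_PREFIX):].strip("/")
    # Optional org ID query string (assumed to be needed on some workspaces).
    suffix = f"?o={org_id}" if org_id else ""
    base = host.rstrip("/")
    urls = []
    for name in names:
        rel = f"{rel_dir}/{name}" if rel_dir else name
        # quote() percent-encodes spaces etc. but keeps "/" separators.
        urls.append(f"{base}/files/{quote(rel)}{suffix}")
    return urls
```

The names would come from the output of "databricks fs ls dbfs:/FileStore/tables/NewDataset", and each resulting URL could then be passed to wget or opened in a browser session that is logged in to the workspace.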
Personally, I use Databricks for teaching Spark in my university modules. Both my students and I like Databricks very much, and we would like to continue using it.
Kind regards,
Nacho
10-29-2021 03:40 PM
hi @Ignacio Castineiras ,
If Arjun.kr has fully answered your question, would you be happy to mark their answer as best so that others can quickly find the solution?
Please let us know if you are still having this issue.