%pip install requirements.txt - path not found

640913
New Contributor III

Hi everyone,

I was just testing things out to come up with a reasonable way of working with version management in Databricks and was inspired by the commands specified here. For my team and me, it makes no sense to put the requirements file in a DBFS location, as that would decouple version management from the actual code development we do in git. The requirements file should of course live in the repository with the code, not in some DBFS folder that is not a git folder.

So I created a requirements.txt file in my git repo, pulled it into my Repos folder, and tried running %pip install -r requirements.txt, but this is where I run into problems:

  • From my notebook "main" I can run my other notebook "notebook1" using the magic command `%run ./notebook1` and I get no errors.
  • From my notebook "main" I cannot run the magic command `%pip install -r ./requirements.txt`; Databricks gives me the error `ERROR: Could not open requirements file: [Errno 2] No such file or directory`.
    • I have tried writing the path to the requirements file in these different ways, all with the same outcome (the fully qualified form is sketched just after this list):
      • ./requirements.txt
      • ./requirements
      • requirements.txt
      • requirements
      • Workspace/repos/user/repo-name/requirements.txt
      • Workspace/repos/user/repo-name/requirements
      • ./Workspace/repos/user/repo-name/requirements.txt
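For reference, the fully qualified workspace path should have a leading slash and a capital Repos, something like the line below (the user and repo names are placeholders). I do not know whether this runtime can read non-notebook files in a repo at all, so this is only a sketch:

```
%pip install -r /Workspace/Repos/<user>/<repo-name>/requirements.txt
```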

So my questions are:

  • Should I write the path another way or is it not possible to point to a .txt file in a repo folder?
    • If it is not possible, how does Databricks intend us to develop code and work with version management in a reasonable way?

My DBR is 7.3 LTS ML (includes Apache Spark 3.0.1, Scala 2.12).

Thank you!

2 REPLIES

UmaMahesh1
Honored Contributor III

You created a requirements.txt and pulled it into your repo folder? I didn't quite follow that part; maybe a screenshot would help my understanding.

If the text file is not stored in any storage location, you can't do what you are trying to do. If it is stored somewhere, you have to give the mount path or the DBFS path accordingly.
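For example, you can copy the file to DBFS (e.g. with dbutils.fs.cp) and point %pip at the /dbfs FUSE path, something like this (the DBFS location is just a placeholder):

```
%pip install -r /dbfs/FileStore/requirements.txt
```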

Uma Mahesh D

640913
New Contributor III

Yes, let me explain this better.

I have a repository in git.

  • The repo contains three files (all in the same folder, the root):
    • the "main" notebook,
    • the "utils" notebook,
    • requirements.txt.
  • The requirements.txt file specifies the environment needed to run the code in my main and utils notebooks (an example follows this list).
  • Because the requirements.txt file relates directly to the notebooks, I want it to live next to them (which is also the standard way of working with any Python project).
  • When I clone the repo to Databricks, I can run my "utils" notebook from my "main" notebook with the magic command %run ./utils.
  • I cannot, however, run the magic command %pip install -r requirements.
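The requirements.txt itself is nothing special, just the usual pip requirements format. Hypothetically it could look like this (not the real contents):

```
pandas==1.1.5
requests==2.25.1
```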

So I have just learned that the path to use changes depending on the magic command, but none of the paths I listed in the original post works; maybe I am still specifying the path incorrectly.

I don't quite understand why I can run the utils notebook from the main notebook but cannot point to a .txt file in the same folder for the sake of version management.
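To see what %pip actually resolves relative paths against, a quick check like this should help (just a diagnostic sketch; my understanding is that %pip works relative to the driver process's working directory, while %run resolves against the notebook's location in the workspace):

```python
# Diagnostic sketch: where does the driver process think it is,
# and is requirements.txt visible from there?
import os
print(os.getcwd())      # on older runtimes this is typically not the repo folder
print(os.listdir("."))  # requirements.txt will not show up here if so
```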
