Get Started Discussions

Databricks Repo

dbsnoo
New Contributor II

Hello,

What is the correct way to install packages from requirements.txt within a Databricks repo? Do I need to add utility notebooks with additional scripts to my repo and run them before any of the scripts in the repo? I suppose adding pip install to every file is a bit extreme, so what is the correct approach?

I would greatly appreciate any help, since this is my first time working with Databricks through a repo.

3 REPLIES

-werners-
Esteemed Contributor III

List the packages you need in a requirements.txt file in the repo. Then just run pip install -r requirements.txt from a notebook to install the packages into the environment for that notebook.

Using Repos is practically the same as using the Workspace, except that it is linked to Git (so you need to commit/push/pull) and the paths are different.
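For readers who want the install step spelled out, here is a minimal sketch in plain Python. In a Databricks notebook the usual form is the `%pip install -r requirements.txt` magic in a cell; the version below does the equivalent through `subprocess` so it can be read and tested anywhere. The function names `pip_install_command` and `install_requirements` and the relative path `requirements.txt` are assumptions for illustration, not something from the thread.

```python
import subprocess
import sys
from pathlib import Path


def pip_install_command(requirements_path):
    """Build the argv for `pip install -r <file>` using the current interpreter."""
    path = Path(requirements_path)
    # sys.executable ties the install to the Python that is running this code.
    return [sys.executable, "-m", "pip", "install", "-r", str(path)]


def install_requirements(requirements_path):
    """Run the install; raises FileNotFoundError if the file does not exist."""
    path = Path(requirements_path)
    if not path.is_file():
        raise FileNotFoundError(f"requirements file not found: {path}")
    subprocess.check_call(pip_install_command(path))


# Hypothetical path: adjust to wherever requirements.txt sits in your repo.
example = pip_install_command("requirements.txt")
print(" ".join(example[1:]))
```

Calling `install_requirements(...)` with the real path of the file in the repo performs the actual install. Since the paths under Repos differ from the Workspace ones, the explicit file check gives a clearer error than a failed pip run.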

dbsnoo
New Contributor II

Appreciate your response, 

pip install -r requirements.txt worked when I created a new notebook and ran some code there, but not for the files in the repo. When I try to run a '*.py' file in my repo through the Databricks run command, I get 'ModuleNotFoundError'. Maybe I am just misunderstanding the concept here: perhaps you are not supposed to run those files directly in Databricks, and if you do, it is better to have them as notebook files rather than '.py' files.

As a side note, I was reading about global init scripts and wondering whether that would be a way to run my files within Databricks.

Maybe someone could point me to some information (docs, videos, or anything else) about working in a Databricks repo that goes beyond the integration itself.

-werners-
Esteemed Contributor III (accepted solution)

The .py files are very handy if you create classes, functions, etc. They act as modules that you import into a notebook with the import statement. They are not meant to be run directly.
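A small sketch of that import pattern, runnable outside Databricks. It writes a throwaway module into a temporary folder that stands in for the repo, puts that folder on `sys.path`, and imports it. The module name `repo_helpers_demo` and the function `greet` are hypothetical; in a real repo the .py file already exists and only the import lines matter.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Stand-in for a repo folder; in a real repo the .py file is already there.
repo_dir = Path(tempfile.mkdtemp())

# Hypothetical module: a plain .py file that only defines things.
(repo_dir / "repo_helpers_demo.py").write_text(
    "def greet(name):\n"
    "    return f'Hello, {name}'\n"
)

# Make the folder importable, then import the module by name.
if str(repo_dir) not in sys.path:
    sys.path.insert(0, str(repo_dir))
importlib.invalidate_caches()
repo_helpers_demo = importlib.import_module("repo_helpers_demo")

# The notebook calls into the module instead of running the file itself.
greeting = repo_helpers_demo.greet("Databricks")
print(greeting)
```

In a notebook the last part usually shrinks to a regular `import` statement plus a function call. Packages listed in requirements.txt still have to be installed in the notebook's environment before the module that uses them is imported, otherwise the import fails with ModuleNotFoundError.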

 
