Data Engineering

Deploy tar.gz package from private GitHub

thushar
Contributor

We created a Python package (.tar.gz) and stored it in a private GitHub repository.

We are able to connect to that repository (using a PAT) from the Azure Databricks notebook.

Our requirement is to install the package from the .tar.gz file in that notebook:

pip install "https://USERNAME:PASSWORD@github.com/company_github/my_repo/my_package.tar.gz"

We get the error below:

does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.

1 ACCEPTED SOLUTION

Accepted Solutions

Rahul_Samant
Contributor

To install the package with pip, you need to package the repo using setup.py. See this link for more details: https://packaging.python.org/en/latest/tutorials/packaging-projects/

Alternatively, you can pass the tar.gz using --py-files when submitting the job.
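One way to see whether an existing archive already meets this requirement is to look inside it. The error in the question is the one pip reports when the unpacked source contains neither setup.py nor pyproject.toml at its root. The sketch below is a minimal check under that assumption; the function name `find_project_markers` and the archive names are hypothetical, and it only inspects a local .tar.gz file (it does not download anything from GitHub).

```python
import posixpath
import tarfile

# Files pip looks for when deciding whether a source tree is a Python project.
PROJECT_MARKERS = {"setup.py", "pyproject.toml"}


def find_project_markers(archive_path):
    """Return the marker files found at the root of a .tar.gz archive.

    "Root" means either the archive's top level or the single top-level
    directory that source distributions normally contain
    (for example my_package-0.1/).
    """
    with tarfile.open(archive_path, "r:gz") as archive:
        # Normalize names so "./pkg/setup.py" and "pkg/setup.py" compare equal.
        names = [
            posixpath.normpath(member.name)
            for member in archive.getmembers()
            if member.isfile()
        ]

    # First path component of every file; a typical sdist has exactly one.
    top_levels = {name.split("/", 1)[0] for name in names}

    found = set()
    for name in names:
        parts = name.split("/")
        if len(parts) == 1 and parts[0] in PROJECT_MARKERS:
            # Marker sits directly at the top level of the archive.
            found.add(parts[0])
        elif len(parts) == 2 and len(top_levels) == 1 and parts[1] in PROJECT_MARKERS:
            # Marker sits inside the single wrapping directory.
            found.add(parts[1])
    return sorted(found)
```

Calling `find_project_markers("my_package.tar.gz")` on a downloaded copy of the file returns an empty list when neither marker is present at the root, which is the situation the error message describes.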


4 REPLIES

Anonymous
Not applicable

Hello @Thushar R​ - My name is Piper, and I'm a moderator for Databricks. It's great to have you here and thank you for your question. 🙂

Let's give the community a while to answer before we circle back around to this.

Rahul_Samant
Contributor

To install the package with pip, you need to package the repo using setup.py. See this link for more details: https://packaging.python.org/en/latest/tutorials/packaging-projects/

Alternatively, you can pass the tar.gz using --py-files when submitting the job.

Thanks, but why do I need to build the package again?

I am using VS Code for development. Through VS Code I built my package, generated the tar.gz and wheel files, and committed them to Git. So the aim is to install the already-built package from Git into the Azure Databricks notebook.

What I understood from your reply is that, even though the package exists, we have to build it again to install it, so we need the setup.py file, right?
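The two files mentioned here are handled differently by pip, which bears on the question. A wheel (.whl) is a built distribution that pip unpacks without a build step, while a .tar.gz is a source distribution that pip builds at install time, so it must contain setup.py or pyproject.toml. The sketch below only illustrates that distinction by file extension; the helper name `distribution_kind` and the example filenames are hypothetical.

```python
def distribution_kind(filename):
    """Classify a distribution file by its extension (illustrative only)."""
    lower = filename.lower()
    if lower.endswith(".whl"):
        # Built distribution: installed by unpacking, no setup.py needed.
        return "wheel"
    if lower.endswith(".tar.gz"):
        # Source distribution: pip builds it on install, so the archive
        # must carry setup.py or pyproject.toml.
        return "sdist"
    return "unknown"


if __name__ == "__main__":
    # Hypothetical filenames for the two artifacts described in the post.
    for name in ("my_package-0.1-py3-none-any.whl", "my_package.tar.gz"):
        print(name, "->", distribution_kind(name))
```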

Atanu
Esteemed Contributor

If I understood correctly, yes, that's correct: you need to use the py file. Did Rahul's solution work, @Thushar R​?
