
Deploy tar.gz package from private GitHub

thushar
Contributor

We created a Python package (.tar.gz) and stored it in a private GitHub repository.

We are able to connect to that repository (using a PAT) from an Azure Databricks notebook.

Our requirement is to install the package from the .tar.gz file in that notebook:

"pip install https://USERNAME:PASSWORD@github.com/company_github/my_repo/my_package.tar.gz"

We get the error below:

does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
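
One way to narrow down this error is to inspect the archive itself. pip expects a source archive to carry a setup.py or pyproject.toml, normally inside a single top-level directory. The sketch below uses only the Python standard library to list which of those files a .tar.gz contains. It assumes the archive has already been downloaded to a local path; find_project_files is a hypothetical helper name, and the depth check assumes the usual source-distribution layout.

```python
import tarfile


def find_project_files(archive_path):
    """Return the build files (setup.py / pyproject.toml) found near the
    top level of a .tar.gz archive, as a sorted list."""
    wanted = {"setup.py", "pyproject.toml"}
    found = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        for name in tar.getnames():
            # Drop empty and "." segments so "./pkg/setup.py" is handled.
            parts = [p for p in name.split("/") if p not in ("", ".")]
            # Accept the file at the archive root or one directory down
            # (e.g. my_package-0.1.0/setup.py); deeper copies are ignored.
            if 1 <= len(parts) <= 2 and parts[-1] in wanted:
                found.add(parts[-1])
    return sorted(found)


# Example usage (the path is a placeholder):
# print(find_project_files("/tmp/my_package.tar.gz"))
```

An empty result suggests the archive is not something pip can build from. If the file cannot be opened as a gzip tar at all, the download itself may not have returned the archive.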

1 ACCEPTED SOLUTION

Rahul_Samant
Contributor

To install the package with pip, the repo needs to be packaged using setup.py. See this link for more details: https://packaging.python.org/en/latest/tutorials/packaging-projects/

Alternatively, you can pass the tar.gz using --py-files when submitting the job.


4 REPLIES

Anonymous
Not applicable

Hello @Thushar R​ - My name is Piper, and I'm a moderator for Databricks. It's great to have you here and thank you for your question. 🙂

Let's give the community a while to answer before we circle back around to this.

thushar
Contributor

Thanks, but why do I need to build the package again?

I am using VS Code for development. I built my package there, generated the tar.gz and wheel files, and committed them to Git. The aim is to install the already-built package from Git in an Azure Databricks notebook.

What I understood from your reply is that even though the built package exists, it has to be built again for installation, so we need the setup.py file. Is that right?
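
For the goal described here (installing an already-built file that is committed to a private repository), one possible approach is to download the file first and then point pip at the local copy. The sketch below is not from this thread: it assumes the GitHub REST "contents" endpoint with a raw media type and a PAT sent as a bearer token, which should be checked against the current GitHub documentation. The helper names, the default branch "main", and the destination path are all hypothetical.

```python
import subprocess
import sys
import urllib.request
from urllib.parse import quote


def build_download_request(owner, repo, path, token, ref="main"):
    """Build an authenticated request for one file in a GitHub repo.

    Assumption: the REST contents endpoint returns the raw file bytes
    when the raw media type is requested.
    """
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/contents/{quote(path)}?ref={quote(ref, safe='')}"
    )
    headers = {
        "Authorization": f"Bearer {token}",      # PAT, never hard-coded
        "Accept": "application/vnd.github.raw",  # ask for raw content
    }
    return urllib.request.Request(url, headers=headers)


def download_and_install(owner, repo, path, token, dest, ref="main"):
    """Download the built package to `dest`, then install it with pip."""
    request = build_download_request(owner, repo, path, token, ref)
    with urllib.request.urlopen(request) as response, open(dest, "wb") as out:
        out.write(response.read())
    # Keep the original file name in `dest`; pip reads wheel metadata
    # from the file name.
    subprocess.run([sys.executable, "-m", "pip", "install", dest], check=True)


# Example usage (all values are placeholders):
# download_and_install("company_github", "my_repo", "my_package.tar.gz",
#                      token="<PAT from a secret store>",
#                      dest="/tmp/my_package.tar.gz")
```

In a Databricks notebook it may be preferable to do only the download in Python and then run %pip install on the downloaded path, so that the package is installed for the notebook session. A wheel file avoids the build step entirely, while a .tar.gz is still built by pip at install time and therefore needs its setup.py or pyproject.toml inside the archive.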

Atanu
Databricks Employee

If I understood correctly, yes, that's correct: you need to use the py file. Did Rahul's solution work, @Thushar R?
