
Databricks and AWS CodeArtifact

axelboursin
New Contributor II

Hello, I have seen multiple topics about this, but I still need an explanation and a solution.

In my context, we have developers who develop internal Python projects, such as X, and publish them to AWS CodeArtifact.
In Databricks, we have a cluster with the main project A installed as a library, and A depends on X.

A's pyproject.toml looks like this:
[tool.poetry.dependencies]
X = { source = "codeartifact", version = "0.1.0" }

[[tool.poetry.source]]
name = "codeartifact"
url = "https://domain-ownerid.d.codeartifact.region-name.amazonaws.com/pypi/repo/simple/"
priority = "supplemental"

The problem: the cluster only searches the public PyPI repositories, so it cannot resolve X from our CodeArtifact repository when installing A. How can I make the cluster install it from CodeArtifact?

Thanks for your answers and your help!

2 REPLIES

axelboursin
New Contributor II

I saw that the solution may be an init script, but it's not really easy to work with.

I mean, no log is generated from the bash script, so it is hard to debug. Maybe you have some advice about it?

stbjelcevic
Databricks Employee

Hi @axelboursin,

I think this article will help you out: https://docs.databricks.com/aws/en/admin/workspace-settings/default-python-packages (option 1 below).

Recommended approaches (choose based on your environment):

  • For broad, consistent behavior across clusters and notebooks, configure workspace-level default Python package repositories to point to CodeArtifact; this avoids per-notebook tokens and works for both serverless and classic compute (see the index URL sketch after this list).

  • (The approach you mentioned) For classic clusters only, add a cluster-scoped init script that runs aws codeartifact login and writes the pip config, so pip automatically resolves from your CodeArtifact repo at cluster start (see the init script sketch after this list).

  • For one-off installs or testing, use notebook-scoped %pip with --index-url (and --extra-index-url as needed) and credentials pulled from Databricks Secrets inside the notebook (see the notebook sketch at the end).
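
For option 1, the workspace setting needs an index URL for your CodeArtifact repository. Here is a rough sketch of how that URL is typically built for pip, assuming the AWS CLI has access to your domain; the domain / ownerid / region-name / repo placeholders simply mirror the URL from your pyproject.toml, and the linked doc describes how the setting expects credentials to be supplied.

# Fetch a CodeArtifact authorization token (valid for a limited time,
# 12 hours by default) and build a pip index URL from it.
TOKEN=$(aws codeartifact get-authorization-token \
  --domain domain \
  --domain-owner ownerid \
  --region region-name \
  --query authorizationToken \
  --output text)

# CodeArtifact uses HTTP basic auth with the fixed username "aws".
echo "https://aws:${TOKEN}@domain-ownerid.d.codeartifact.region-name.amazonaws.com/pypi/repo/simple/"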

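For option 2, and for the logging problem you mentioned, here is a minimal cluster-scoped init script sketch. It assumes the cluster's instance profile can call CodeArtifact and that the AWS CLI is available on the node; instead of aws codeartifact login it writes /etc/pip.conf directly so the configuration applies system-wide, and it redirects its own output to a file so you have a log to inspect when something fails.

#!/bin/bash
# Cluster-scoped init script (sketch) - adapt the domain/repo placeholders.
set -euo pipefail

# Send everything this script prints to a file so there is a log to inspect.
exec > /tmp/codeartifact-init.log 2>&1

echo "Fetching CodeArtifact token..."
TOKEN=$(aws codeartifact get-authorization-token \
  --domain domain \
  --domain-owner ownerid \
  --region region-name \
  --query authorizationToken \
  --output text)

# Point pip at CodeArtifact globally, keeping public PyPI as a fallback index.
cat > /etc/pip.conf <<EOF
[global]
index-url = https://aws:${TOKEN}@domain-ownerid.d.codeartifact.region-name.amazonaws.com/pypi/repo/simple/
extra-index-url = https://pypi.org/simple/
EOF

echo "pip configured for CodeArtifact."

Keep in mind the token expires (12 hours by default, configurable with --duration-seconds), so clusters that run longer than that will need a restart or another refresh mechanism.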
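
For option 3, here is a sketch of the notebook-scoped approach. The secret scope and key names are hypothetical, and the secret is assumed to hold a valid CodeArtifact authorization token; the $token reference in the %pip line uses the same variable-substitution pattern the Databricks docs show for installing from private repositories with %pip.

# Cell 1: read the token from a secret scope (hypothetical scope/key names).
token = dbutils.secrets.get(scope="codeartifact", key="auth-token")

# Cell 2: install X from CodeArtifact, with public PyPI as a fallback for everything else.
%pip install X==0.1.0 --index-url=https://aws:$token@domain-ownerid.d.codeartifact.region-name.amazonaws.com/pypi/repo/simple/ --extra-index-url=https://pypi.org/simple/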