Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Pull JAR from private Maven repository (Azure Artifactory)

Dom1
New Contributor III

Hi,

I am currently struggling with the following task:

We want to push our code to a private repository (Azure Artifactory) and then pull it from Databricks when the job runs. This currently works only with wheels in a PyPI repository inside the Artifactory. I found some older comments saying that private Maven repositories are not supported, but I could not find any documentation on this.

Can someone tell me whether private Maven repositories are unsupported? It would be great to have an official source for this.

Thanks a lot

2 REPLIES

iyashk-DB
Databricks Employee

Databricks can install Maven libraries by coordinate and lets you point at a custom repository URL.
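To illustrate what "by coordinate with a custom repository URL" looks like in practice, here is a minimal Python sketch that builds a Maven library entry in the shape the Databricks Libraries and Jobs APIs accept (`maven` with `coordinates` and an optional `repo`). The coordinate and the feed URL are made-up placeholders, and the helper name `maven_library` is hypothetical. Note that this only covers repositories that do not require authentication.

```python
import json
from typing import Optional


def maven_library(coordinates: str, repo: Optional[str] = None) -> dict:
    """Build one library entry for a cluster or job task (hypothetical helper)."""
    spec = {"coordinates": coordinates}
    if repo:
        # Custom repository URL; omit it to fall back to the default repositories.
        spec["repo"] = repo
    return {"maven": spec}


# Placeholder values, not a real artifact or feed.
library = maven_library(
    "com.example:my-lib:1.0.0",
    repo="https://repo.example.invalid/maven/v1",
)
print(json.dumps(library, indent=2))
```

The resulting dictionary would go into the `libraries` list of a job task or a library install request.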

However, passing credentials for authenticated private Maven repositories directly through the Libraries UI/Jobs is not natively supported today and requires workarounds; this has been tracked internally as a product ask rather than a GA feature.

One workaround for a private Maven host that requires authentication is to supply Apache Ivy settings via an init script: the settings provide the credentials and repository resolution, and Ivy then resolves the packages at cluster startup.

For this, you can create an ivysettings.xml file with credentials and point Spark to it; for newer runtimes, you can swap in a patched Ivy JAR to externalise the settings file for multiple repositories with authentication.
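As a rough sketch of what such an ivysettings.xml could contain, the Python snippet below generates one with a `credentials` element and a resolver chain that tries the private repository first. The host, realm, user name, repository URL, and the `MAVEN_REPO_TOKEN` environment variable are all assumptions for illustration; the realm in particular has to match whatever your repository sends in its authentication challenge. Spark's `spark.jars.ivySettings` property is the usual way to point Spark at the file, but whether a given Databricks runtime honours it for cluster libraries is something to verify on your side.

```python
import os
import xml.etree.ElementTree as ET


def build_ivy_settings(repo_url: str, host: str, realm: str,
                       username: str, password: str) -> str:
    """Return the text of an ivysettings.xml for one authenticated repository."""
    root = ET.Element("ivysettings")
    # Route all lookups through the resolver chain defined below.
    ET.SubElement(root, "settings", defaultResolver="main")
    # Ivy selects credentials by the host and realm of the HTTP auth challenge.
    ET.SubElement(root, "credentials", host=host, realm=realm,
                  username=username, passwd=password)
    resolvers = ET.SubElement(root, "resolvers")
    chain = ET.SubElement(resolvers, "chain", name="main")
    # Private repository first, then a public fallback on the resolver's default root.
    ET.SubElement(chain, "ibiblio", name="private", m2compatible="true", root=repo_url)
    ET.SubElement(chain, "ibiblio", name="central", m2compatible="true")
    return ET.tostring(root, encoding="unicode")


# Placeholder values; the token is read from the environment rather than hard-coded.
settings_xml = build_ivy_settings(
    repo_url="https://repo.example.invalid/maven/v1",
    host="repo.example.invalid",
    realm="example-realm",
    username="build-user",
    password=os.environ.get("MAVEN_REPO_TOKEN", "changeme"),
)
print(settings_xml)
```

An init script would write this text to a file on the cluster before libraries are resolved; keeping the token in a secret rather than in the script itself avoids leaking it through cluster logs or source control.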

Dom1
New Contributor III

Thanks for your help and your response. I will try your workaround and come back to you 🙂

I think another possible solution for us would be to push the artifacts into a Databricks volume and then install the libraries from there. That way we would not need the workaround, but I am struggling to understand what the best practices are for this case and how others solve this issue.
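For reference, the volume idea could look roughly like the following Python sketch, which builds a JAR library entry pointing at a Unity Catalog volume path (`/Volumes/<catalog>/<schema>/<volume>/...`). The catalog, schema, volume, and file names are made up, and the helper `volume_jar_library` is hypothetical; uploading the JAR to the volume would be a separate step in the build pipeline.

```python
def volume_jar_library(catalog: str, schema: str, volume: str, file_name: str) -> dict:
    """Build a JAR library entry that points at a file in a volume (hypothetical helper)."""
    if not file_name.endswith(".jar"):
        raise ValueError(f"expected a .jar file, got {file_name!r}")
    # Volume paths follow /Volumes/<catalog>/<schema>/<volume>/<path>.
    path = f"/Volumes/{catalog}/{schema}/{volume}/{file_name}"
    return {"jar": path}


# Placeholder names, not a real catalog or artifact.
library = volume_jar_library("main", "libs", "artifacts", "my-lib-1.0.0.jar")
print(library)
```

The entry would be added to the `libraries` list of the job task, in the same place a Maven coordinate would otherwise go.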