Databricks Community

User16871418122 · ‎05-07-2021

I want to import a Maven library with its dependencies. How do I do it?

User16871418122 · ‎05-07-2021

I recommend creating an uber jar, or downloading the jars offline and using them in your clusters until Maven is healthy again:

1. Install the Maven CLI tool on your local Mac:

 brew install mvnvm

2. Download the Artifact with all dependencies:

mvn dependency:get -DrepoUrl=http://packages.confluent.io/maven/ -DgroupId=org.apache.kafka -DartifactId=kafka_2.11 -Dversion=0.10.0.0-cp1

3. Change into the local Maven repository and find the path of the downloaded jar:

cd $HOME/.m2

find ./ -name "kafka-clients-0.10.0.0-cp1*"

4. Copy the jar and its dependencies to /Users/user-gbth/mvn-lib-kafka/:

mkdir -p /Users/user-gbth/mvn-lib-kafka/

cp /Users/user-gbth/.m2/repository/org/apache/kafka/kafka-clients/0.10.0.0-cp1/kafka-clients-0.10.0.0-cp1.jar /Users/user-gbth/mvn-lib-kafka/

mvn dependency:copy-dependencies -f /Users/user-gbth/.m2/repository/org/apache/kafka/kafka-clients/0.10.0.0-cp1/kafka-clients-0.10.0.0-cp1.pom -DoutputDirectory=/Users/user-gbth/mvn-lib-kafka/

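5. Zip the jars and import the zip file into Databricks (-j stores the jars without their directory paths, so they extract directly into the target directory):

zip -j kafka-clients-0.10.0.0-cp1.zip /Users/user-gbth/mvn-lib-kafka/*.jar

You can upload the zip via the UI, or upload it to an S3 bucket and load it into Databricks.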

6. Load the jars in Databricks:

a) Extract the zip to /databricks/jars/ using a Databricks init script (-j drops any directory paths stored in the zip):

#!/bin/bash
unzip -j /dbfs/path/kafka-clients-0.10.0.0-cp1.zip -d /databricks/jars/

b) Or wrap everything as an uber jar and attach it to the cluster's library list. The maven-shade-plugin can build one:

https://maven.apache.org/plugins/maven-shade-plugin/examples/includes-excludes.html
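
If you do this more than once, steps 2 to 5 can be collected into one script. This is only a sketch, not something I have run against a cluster: the function names (local_repo_path, bundle_jars), the OUT_DIR variable and the "run" argument are made up for illustration, and it assumes the default local repository under $HOME/.m2/repository and that mvn and zip are on your PATH.

```shell
#!/bin/bash
# Sketch of steps 2-5 as one script. Function and variable names are illustrative.
set -eu

# Maven coordinates and repository URL taken from the steps above.
GROUP_ID="org.apache.kafka"
VERSION="0.10.0.0-cp1"
REPO_URL="http://packages.confluent.io/maven/"
# Assumption: staging directory for the jars; override with OUT_DIR=... if needed.
OUT_DIR="${OUT_DIR:-$HOME/mvn-lib-kafka}"

# Print the path of an artifact in the default local repository.
# Layout: <groupId with dots as slashes>/<artifactId>/<version>/<artifactId>-<version>.<ext>
# Args: groupId artifactId version [extension, default "jar"]
local_repo_path() (
  group_dir="$(printf '%s' "$1" | tr . /)"
  printf '%s/.m2/repository/%s/%s/%s/%s-%s.%s\n' \
    "$HOME" "$group_dir" "$2" "$3" "$2" "$3" "${4:-jar}"
)

# Download, copy, and zip the kafka-clients jar plus its dependencies.
bundle_jars() {
  out_dir="$1"
  mkdir -p "$out_dir"

  # Step 2: fetch the artifact and its transitive dependencies into ~/.m2.
  mvn dependency:get -DrepoUrl="$REPO_URL" -DgroupId="$GROUP_ID" \
    -DartifactId=kafka_2.11 -Dversion="$VERSION"

  # Step 4: copy the kafka-clients jar, then the dependencies listed in its pom.
  cp "$(local_repo_path "$GROUP_ID" kafka-clients "$VERSION" jar)" "$out_dir/"
  mvn dependency:copy-dependencies \
    -f "$(local_repo_path "$GROUP_ID" kafka-clients "$VERSION" pom)" \
    -DoutputDirectory="$out_dir"

  # Step 5: zip the jars without directory paths (-j).
  zip -j "$out_dir.zip" "$out_dir"/*.jar
}

# Only run the whole flow when explicitly asked to: ./bundle.sh run
if [ "${1:-}" = "run" ]; then
  bundle_jars "$OUT_DIR"
fi
```

Saved as bundle.sh (any name works), ./bundle.sh run should leave the zip next to the staging directory, ready to upload as in step 5. Building the path from the coordinates replaces the find in step 3, so it only works if your local repository is in the default location.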
