Data Engineering

Does Databricks have a Maven repository to download the jars from?

brickster_2018
Esteemed Contributor

Using OSS jars always causes classpath issues when running the job on Databricks. The same job works fine on EMR and on-premise.

1 ACCEPTED SOLUTION


brickster_2018
Esteemed Contributor

Databricks does not host the jars in a Maven repository of its own. The OSS jars can be used to compile the application, but they should not be used at execution time; i.e., the application jars should be thin jars rather than fat jars.
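As a minimal sketch of the thin-jar approach in sbt (the Spark version and module here are illustrative assumptions, not from this thread), marking Spark dependencies as `provided` keeps them out of the assembled jar so the cluster's own classes are loaded at runtime:

```scala
// build.sbt — minimal sketch; the version number is an illustrative assumption.
// "provided" scope puts the dependency on the compile classpath but excludes
// it from the assembled (e.g. sbt-assembly) jar, so the Spark classes shipped
// with the Databricks cluster are the ones used at execution time.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql" % "3.3.0" % "provided"
)
```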

If internal Spark APIs are used, it's possible that the Databricks version of those classes has different methods, and compiling your application against the OSS jars causes classpath issues. In such scenarios, the jars available on the cluster should be copied to your local machine and used for compilation. Instructions for obtaining the jars are available at:

https://docs.databricks.com/dev-tools/databricks-connect.html#intellij-scala-or-java
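Following that guide, a sketch of the workflow with the Databricks Connect CLI might look like the below (the source file path is an illustrative assumption):

```shell
# Sketch: locate the cluster-matching jars installed by Databricks Connect
# and compile against them instead of the OSS artifacts.
JAR_DIR=$(databricks-connect get-jar-dir)   # prints the local jar directory
scalac -classpath "$JAR_DIR/*" src/main/scala/MyJob.scala
```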


2 REPLIES


mj2022
New Contributor III

I followed https://docs.databricks.com/dev-tools/databricks-connect.html#intellij-scala-or-java to obtain the spark-avro jar, since Databricks has its own custom from_avro method for use with the Kafka schema registry. But I am not able to find the spark-avro jar using Databricks Connect. We need that jar for compilation. Any idea why the spark-avro jar is missing when we get the jars using "databricks-connect get-jar-dir"?
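One hedged workaround sketch, for compilation only: pull the OSS spark-avro artifact at `provided` scope (the version is an illustrative assumption). Note this only helps if the code compiles against OSS signatures; the Schema-Registry-aware from_avro overload is specific to Databricks Runtime and is not present in the OSS artifact.

```scala
// build.sbt sketch: OSS spark-avro at "provided" scope, compile-time only.
// Caveat: the Databricks-specific from_avro overload that takes a Schema
// Registry address does not exist in this OSS artifact, so code using it
// will still not compile against this dependency.
libraryDependencies += "org.apache.spark" %% "spark-avro" % "3.3.0" % "provided"
```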
