06-23-2021 11:40 PM - last edited 3 weeks ago by Advika
I have two jars that each contain a class with the same fully qualified name. They run fine on YARN, but when I try to run these jars on a Databricks cluster, I run into issues. Why does Databricks have this limitation?
Labels: Databricks Cluster
Accepted Solutions
06-23-2021 11:52 PM
When you run the jobs on YARN, they are submitted as two separate applications, so each application gets its own Spark driver JVM. In Databricks, a cluster has a single JVM for the Spark driver. When jars containing classes with the same name are attached to the same JVM, the classes can be loaded from the wrong jar.
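To see which copy of a duplicated class actually "won" on the driver's classpath, you can inspect the class's CodeSource. A minimal diagnostic sketch (the class name passed in is a placeholder for your own conflicting class):

```java
// WhichJar.java - print the jar (or directory) a class was loaded from.
// Run this on the driver to see which of the two jars supplied the class.
public class WhichJar {
    public static void main(String[] args) throws Exception {
        // Replace the default with your conflicting class's fully qualified name.
        String name = args.length > 0 ? args[0] : "WhichJar";
        Class<?> c = Class.forName(name);
        // Note: getCodeSource() returns null for bootstrap classes
        // such as java.lang.String, so guard accordingly in real use.
        System.out.println(c.getName() + " loaded from: "
                + c.getProtectionDomain().getCodeSource().getLocation());
    }
}
```

Running this for each job makes the conflict visible: both jobs will report the same jar location even though they bundled different copies of the class.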
Mitigations:
- Run each job on its own on-demand (job) cluster. This ensures each jar gets a dedicated driver JVM.
- Rename the class in one of the jars to avoid the conflict.
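Beyond the mitigations above, another common approach (not mentioned in the original thread, so treat this as a suggestion) is to relocate the conflicting package in one jar at build time with the Maven Shade Plugin; `com.example` below is a placeholder for the actual conflicting package:

```xml
<!-- pom.xml fragment: rewrite com.example.* to shaded.com.example.* inside
     this jar so its classes no longer collide with the other jar's copies. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.example</pattern>
            <shadedPattern>shaded.com.example</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites both the class files and the bytecode references inside the shaded jar, so the two jars can then coexist on the same driver JVM.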

