06-23-2021 11:40 PM - last edited 3 weeks ago by Advika
I have two jars that each contain a class with the same fully qualified name. They run fine on YARN, but when I try to run these jars on a Databricks cluster, I run into issues. Why does Databricks have this limitation?
Labels: Databricks Cluster
Accepted Solutions
06-23-2021 11:52 PM
When you run the jobs on YARN, they are submitted as two separate applications, so each application gets its own Spark driver JVM. In Databricks, a cluster has a single JVM for the Spark driver. When jars containing classes with the same name are attached to the same JVM, the classes can be loaded from the wrong jar.
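To see which copy of a duplicated class actually "won" on the driver's classpath, you can inspect the class's CodeSource. A minimal diagnostic sketch (the class name passed in is a placeholder for your own conflicting class):

```java
// WhichJar.java - print the jar (or directory) a class was loaded from.
// Run this on the driver to see which of the two jars supplied the class.
public class WhichJar {
    public static void main(String[] args) throws Exception {
        // Replace the default with your conflicting class's fully qualified name.
        String name = args.length > 0 ? args[0] : "WhichJar";
        Class<?> c = Class.forName(name);
        // Note: getCodeSource() returns null for bootstrap classes
        // such as java.lang.String, so guard accordingly in real use.
        System.out.println(c.getName() + " loaded from: "
                + c.getProtectionDomain().getCodeSource().getLocation());
    }
}
```

Running this for each job makes the conflict visible: both jobs will report the same jar location even though they bundled different copies of the class.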
Mitigations:
- Run each job on its own on-demand (job) cluster. This ensures each jar gets a dedicated driver JVM.
- Rename the class in one of the jars to avoid the conflict.
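Beyond the mitigations above, another common approach (not mentioned in the original thread, so treat this as a suggestion) is to relocate the conflicting package in one jar at build time with the Maven Shade Plugin; `com.example` below is a placeholder for the actual conflicting package:

```xml
<!-- pom.xml fragment: rewrite com.example.* to shaded.com.example.* inside
     this jar so its classes no longer collide with the other jar's copies. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <relocation>
            <pattern>com.example</pattern>
            <shadedPattern>shaded.com.example</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites both the class files and the bytecode references inside the shaded jar, so the two jars can then coexist on the same driver JVM.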

