
Issue: NoSuchMethodError in Spark Job While Upgrading to Databricks 15.5 LTS

sahil_s_jain
New Contributor III

Problem Description

I am attempting to upgrade my application from Databricks runtime version 12.2 LTS to 15.5 LTS. During this upgrade, my Spark job fails with the following error:

java.lang.NoSuchMethodError: org.apache.spark.scheduler.SparkListenerApplicationEnd.<init>(J)V

Root Cause Analysis

  • Spark Version in Databricks 15.5 LTS: The runtime includes Apache Spark 3.5.x, which defines the SparkListenerApplicationEnd constructor as:

    public SparkListenerApplicationEnd(long time)

    This constructor takes a single long parameter.

  • Conflicting Spark Library in Databricks: The error arises due to a conflicting library: ----ws_3_5--core--core-hive-2.3__hadoop-3.2_2.12_deploy.jar. This library includes a different version of the SparkListenerApplicationEnd class, which defines the constructor as:

    public SparkListenerApplicationEnd(long time, scala.Option<Object> exitCode)

    This two-parameter constructor is present in Spark 4.0.0-preview2, not in Spark 3.5.x.

  • Impact: My code was compiled against the single-parameter constructor (<init>(J)V), but the class loaded at runtime from the conflicting jar only defines the two-parameter version, so the JVM cannot link the call and throws the NoSuchMethodError. A quick way to confirm which constructor is actually on the classpath is shown below.
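One quick way to confirm what the cluster's classpath actually exposes is to list the declared constructors of the class from a notebook cell. This is a minimal sketch added for illustration, not part of the original report:

import org.apache.spark.scheduler.SparkListenerApplicationEnd

// List the constructors declared by the class that is actually loaded at runtime.
// On OSS Spark 3.5.x this prints a single (long) constructor; on the affected
// runtime binaries it prints (long, scala.Option) instead.
classOf[SparkListenerApplicationEnd]
  .getDeclaredConstructors
  .foreach(c => println(c.toGenericString))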

Thank you in advance for your support!

7 REPLIES

Walter_C
Databricks Employee

The error you are encountering appears to stem from a mismatch: the runtime includes Apache Spark 3.5.x, which defines the constructor as public SparkListenerApplicationEnd(long time), while a conflicting library (----ws_3_5--core--core-hive-2.3__hadoop-3.2_2.12_deploy.jar) carries a different version of the class whose constructor is public SparkListenerApplicationEnd(long time, scala.Option<Object> exitCode).

 

This issue occurs because the JVM attempts to use the single-parameter constructor but fails due to the conflicting library expecting the two-parameter version, leading to the NoSuchMethodError.

To resolve this issue, you can try the following steps:

  1. Identify and Remove Conflicting Libraries: Check your dependencies and remove or update the conflicting library that includes the SparkListenerApplicationEnd class with the two-parameter constructor. Ensure that all libraries are compatible with Apache Spark 3.5.x (a quick way to see which jar is supplying the class is sketched after this list).

  2. Update Dependencies: Ensure that all your project dependencies are updated to versions compatible with Databricks runtime 15.5 LTS and Apache Spark 3.5.x.
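As a supplement to step 1, the following minimal sketch (an illustration, not part of the original reply) prints which jar actually supplies the class. If it points at a runtime-bundled jar such as the deploy jar mentioned above, the class ships with DBR itself and cannot simply be removed, so the application has to be built against the constructor the runtime provides:

import org.apache.spark.scheduler.SparkListenerApplicationEnd

// Print the code source (jar path) the class was loaded from. A null code source
// means it came from the bootstrap/parent classloader rather than an application jar.
val source = classOf[SparkListenerApplicationEnd].getProtectionDomain.getCodeSource
println(Option(source).map(_.getLocation.toString).getOrElse("<bootstrap/parent classloader>"))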

sahil_s_jain
New Contributor III

The issue is that Databricks 15.4 LTS includes the ws_3_5--core--core-hive-2.3__hadoop-3.2_2.12_deploy.jar library, which is not compatible with Spark 3.5.x. Spark 3.5.x defines a single-argument SparkListenerApplicationEnd constructor.

 

Databricks 15.4 LTS includes Spark 3.5.x, so the ws_3_5--core--core-hive-2.3__hadoop-3.2_2.12_deploy.jar library should be compatible with Spark 3.5.x. But it is not.

sahil_s_jain
New Contributor III

I am trying to instantiate the class org.apache.spark.scheduler.SparkListenerApplicationEnd on Databricks 15.4 LTS.

Spark 3.5.0 expects a single-argument constructor, org.apache.spark.scheduler.SparkListenerApplicationEnd(long time),

whereas the class packaged in the Databricks jar "----ws_3_5--core--core-hive-2.3__hadoop-3.2_2.12_deploy.jar" expects a two-argument constructor, i.e.

org.apache.spark.scheduler.SparkListenerApplicationEnd(long time, scala.Option<Object> exitCode)

This two-argument constructor matches Spark 4.0.0-preview2 and is NOT present in Spark 3.5.0.

This is causing a conflict. Can you please check this version mismatch in the Databricks cluster binaries? In the meantime, a reflection-based construction that tolerates either signature is sketched below.
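If your own code constructs SparkListenerApplicationEnd directly (for example, when emitting synthetic events in tests or custom listeners), one possible stopgap until the binaries are aligned is to select whichever constructor the loaded class actually declares via reflection. This is a minimal sketch under the assumption that the two-argument variant accepts None as the exit code, not an officially recommended fix:

import org.apache.spark.scheduler.SparkListenerApplicationEnd

// Construct SparkListenerApplicationEnd against whichever constructor the
// runtime's class actually declares, avoiding a hard-coded (long) call that
// fails to link on binaries built from the Spark 4.0 working branch.
def applicationEnd(time: Long): SparkListenerApplicationEnd = {
  val cls = classOf[SparkListenerApplicationEnd]
  cls.getDeclaredConstructors.find(_.getParameterCount == 2) match {
    case Some(twoArg) =>
      // Runtime binary with (long, scala.Option<Object>): pass None as the exit code.
      twoArg.newInstance(java.lang.Long.valueOf(time), None)
        .asInstanceOf[SparkListenerApplicationEnd]
    case None =>
      // OSS Spark 3.5.x binary with (long).
      cls.getConstructor(java.lang.Long.TYPE)
        .newInstance(java.lang.Long.valueOf(time))
        .asInstanceOf[SparkListenerApplicationEnd]
  }
}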

DBonomo
New Contributor II

I can attest to this being the case as well. I ran into this issue trying to implement an updated form of the

com.microsoft.sqlserver.jdbc.spark connector, and found that the implementation in DBR 15.4 LTS is actually mapped to master (the current Spark 4.0 working branch).

[screenshots attached: DBonomo_0-1736264319353.png, DBonomo_1-1736264341599.png]

You can reference the 3.5 implementation here compared to the master branch version here.




sahil_s_jain
New Contributor III

@DBonomo, did you find any workaround for this?

DBonomo
New Contributor II

No, I am currently downgrading to an older DBR (13.3) and running these jobs specifically on that version. That brings its own suite of problems, though.

ameerafi
Databricks Employee

@DBonomo @sahil_s_jain We can write a separate getSchema method inside the BulkCopyUtils.scala file and call that method instead of referring to it from Spark. You can add the function below to the BulkCopyUtils.scala file and build the connector locally. You can then call it with `val tableCols = BulkCopyJdbcUtils.getSchema(rs, JdbcDialects.get(url))`.

// Imports needed for this helper (java.sql types plus Spark SQL types).
import java.sql.{ResultSet, ResultSetMetaData, SQLException}

import org.apache.spark.sql.jdbc.JdbcDialect
import org.apache.spark.sql.types._

/**
 * Utility object containing a getSchema implementation for Spark 3.5.
 * This replaces the call to org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils.getSchema
 * to avoid method signature conflicts in DBR 15.4.
 */
object BulkCopyJdbcUtils {

  /**
   * Takes a [[ResultSet]] and returns its Catalyst schema.
   * This is the Spark 3.5 version with 3 parameters.
   *
   * @param resultSet The ResultSet to extract the schema from
   * @param dialect The JDBC dialect to use for type mapping
   * @param alwaysNullable If true, all the columns are nullable.
   * @return A [[StructType]] giving the Catalyst schema.
   * @throws SQLException if the schema contains an unsupported type.
   */
  def getSchema(
      resultSet: ResultSet,
      dialect: JdbcDialect,
      alwaysNullable: Boolean = false): StructType = {
    val rsmd = resultSet.getMetaData
    val ncols = rsmd.getColumnCount
    val fields = new Array[StructField](ncols)
    var i = 0
    while (i < ncols) {
      val columnName = rsmd.getColumnLabel(i + 1)
      val dataType = rsmd.getColumnType(i + 1)
      val typeName = rsmd.getColumnTypeName(i + 1)
      val fieldSize = rsmd.getPrecision(i + 1)
      val fieldScale = rsmd.getScale(i + 1)
      val isSigned = {
        try {
          rsmd.isSigned(i + 1)
        } catch {
          // Workaround for HIVE-14684: Hive's ResultSetMetaData does not support isSigned.
          case e: SQLException if
              e.getMessage == "Method not supported" &&
              rsmd.getClass.getName == "org.apache.hive.jdbc.HiveResultSetMetaData" => true
        }
      }
      val nullable = if (alwaysNullable) {
        true
      } else {
        rsmd.isNullable(i + 1) != ResultSetMetaData.columnNoNulls
      }
      val metadata = new MetadataBuilder()
        .putString("name", columnName)
        .putLong("scale", fieldScale)
        .build()

      val columnType = getCatalystType(dataType, typeName, fieldSize, fieldScale, isSigned)
      fields(i) = StructField(columnName, columnType, nullable, metadata)
      i = i + 1
    }
    new StructType(fields)
  }

  /**
   * Maps a JDBC type to a Catalyst type using Spark 3.5 logic.
   * Fixed DecimalType.bounded compatibility issue.
   */
  private def getCatalystType(
      sqlType: Int,
      typeName: String,
      precision: Int,
      scale: Int,
      signed: Boolean): DataType = {

    val answer = sqlType match {
      // scalastyle:off
      case java.sql.Types.ARRAY => null
      case java.sql.Types.BIGINT => if (signed) { LongType } else { DecimalType(20, 0) }
      case java.sql.Types.BINARY => BinaryType
      case java.sql.Types.BIT => BooleanType // @see JdbcDialect for quirks
      case java.sql.Types.BLOB => BinaryType
      case java.sql.Types.BOOLEAN => BooleanType
      case java.sql.Types.CHAR => StringType
      case java.sql.Types.CLOB => StringType
      case java.sql.Types.DATALINK => null
      case java.sql.Types.DATE => DateType
      case java.sql.Types.DECIMAL
        if precision != 0 || scale != 0 => createDecimalType(precision, scale)
      case java.sql.Types.DECIMAL => DecimalType.SYSTEM_DEFAULT
      case java.sql.Types.DISTINCT => null
      case java.sql.Types.DOUBLE => DoubleType
      case java.sql.Types.FLOAT => FloatType
      case java.sql.Types.INTEGER => if (signed) { IntegerType } else { LongType }
      case java.sql.Types.JAVA_OBJECT => null
      case java.sql.Types.LONGNVARCHAR => StringType
      case java.sql.Types.LONGVARBINARY => BinaryType
      case java.sql.Types.LONGVARCHAR => StringType
      case java.sql.Types.NCHAR => StringType
      case java.sql.Types.NCLOB => StringType
      case java.sql.Types.NULL => NullType
      case java.sql.Types.NUMERIC
        if precision != 0 || scale != 0 => createDecimalType(precision, scale)
      case java.sql.Types.NUMERIC => DecimalType.SYSTEM_DEFAULT
      case java.sql.Types.NVARCHAR => StringType
      case java.sql.Types.OTHER => null
      case java.sql.Types.REAL => DoubleType
      case java.sql.Types.REF => StringType
      case java.sql.Types.REF_CURSOR => null
      case java.sql.Types.ROWID => LongType
      case java.sql.Types.SMALLINT => IntegerType
      case java.sql.Types.SQLXML => StringType
      case java.sql.Types.STRUCT => StringType
      case java.sql.Types.TIME => TimestampType
      case java.sql.Types.TIME_WITH_TIMEZONE => null
      case java.sql.Types.TIMESTAMP => TimestampType
      case java.sql.Types.TIMESTAMP_WITH_TIMEZONE => null
      case java.sql.Types.TINYINT => IntegerType
      case java.sql.Types.VARBINARY => BinaryType
      case java.sql.Types.VARCHAR => StringType
      case _ =>
        throw new SQLException("Unrecognized SQL type " + sqlType)
      // scalastyle:on
    }

    if (answer == null) {
      throw new SQLException("Unsupported type " + sqlType)
    }
    answer
  }

  /**
   * Helper method to create DecimalType with proper bounds checking.
   * This replaces DecimalType.bounded, which may not be accessible.
   */
  private def createDecimalType(precision: Int, scale: Int): DecimalType = {
    // Ensure precision and scale are within valid bounds
    val validPrecision = math.min(math.max(precision, 1), DecimalType.MAX_PRECISION)
    val validScale = math.min(math.max(scale, 0), validPrecision)

    try {
      // Try the standard constructor first
      DecimalType(validPrecision, validScale)
    } catch {
      case _: Exception =>
        // Fall back to the system default if construction fails
        DecimalType.SYSTEM_DEFAULT
    }
  }
}
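For completeness, a hypothetical standalone usage of the helper might look like the following; the JDBC URL, credentials, and table name are placeholders and not from the original reply:

import java.sql.DriverManager
import org.apache.spark.sql.jdbc.JdbcDialects

// Hypothetical usage: derive the Catalyst schema for a JDBC result set without
// calling the runtime's JdbcUtils.getSchema. All connection details are placeholders.
val url = "jdbc:sqlserver://<host>:1433;databaseName=<db>"
val conn = DriverManager.getConnection(url, "<user>", "<password>")
try {
  // An empty result set is enough: only the metadata is inspected.
  val rs = conn.createStatement().executeQuery("SELECT * FROM dbo.my_table WHERE 1 = 0")
  val tableCols = BulkCopyJdbcUtils.getSchema(rs, JdbcDialects.get(url))
  println(tableCols.treeString)
} finally {
  conn.close()
}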

 
