I am trying to run my Spark-dependent application on a Databricks cluster:
- the application is built against spark-3.3.2
- the cluster runs Databricks Runtime 12.2 LTS (spark-3.3.2, scala-2.12)
I end up with:
```
NoSuchMethodError: org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter$.$lessinit$greater$default$4()Z
```
It seems that the Spark included in Databricks Runtime 12.2 LTS does not contain the commit that changed the definition of ParquetToSparkSchemaConverter: https://issues.apache.org/jira/browse/SPARK-40819 (Spark GitHub: https://github.com/apache/spark/blame/a7bbaca013ad1ae92a437b12206fadfe93fea10f/sql/core/src/main/sca... ). The difference can also be spotted by comparing the release notes of Spark and of the Databricks Runtime.
To investigate further, I disassembled the class as shipped with Spark 3.3.2:
```
Compiled from "ParquetSchemaConverter.scala"
public class org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter {
  private final boolean assumeBinaryIsString;
  private final boolean assumeInt96IsTimestamp;
  private final boolean caseSensitive;
  private final boolean nanosAsLong;
  public static boolean $lessinit$greater$default$4();
    Code:
       0: getstatic     #85  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #87  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$4:()Z
       6: ireturn
  public static boolean $lessinit$greater$default$3();
    Code:
       0: getstatic     #85  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #90  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$3:()Z
       6: ireturn
  public static boolean $lessinit$greater$default$2();
    Code:
       0: getstatic     #85  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #93  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$2:()Z
       6: ireturn
  public static boolean $lessinit$greater$default$1();
    Code:
       0: getstatic     #85  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #96  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$1:()Z
       6: ireturn
```
When checking the Databricks-provided jar, however:
```
Compiled from "ParquetSchemaConverter.scala"
public class org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter {
  private final boolean assumeBinaryIsString;
  private final boolean assumeInt96IsTimestamp;
  private final boolean caseSensitive;
  public static boolean $lessinit$greater$default$3();
    Code:
       0: getstatic     #84  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #86  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$3:()Z
       6: ireturn
  public static boolean $lessinit$greater$default$2();
    Code:
       0: getstatic     #84  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #89  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$2:()Z
       6: ireturn
  public static boolean $lessinit$greater$default$1();
    Code:
       0: getstatic     #84  // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #92  // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$1:()Z
       6: ireturn
```
So `public static boolean $lessinit$greater$default$4()` is indeed NOT present in the Databricks-provided jar: the `nanosAsLong` field and the fourth constructor parameter are missing.
Can someone confirm that this difference is in fact expected? How should it be handled?
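In the meantime, one technique I am considering is to select whichever constructor arity is actually present at runtime via reflection, instead of relying on the compiler-generated default accessor. Below is a minimal, self-contained sketch of that idea. Since the Spark classes are not on the classpath here, it uses a stand-in class with only a three-argument constructor to mimic the Databricks jar; all class and parameter names in the stand-in are illustrative, not Spark's actual API:

```java
import java.lang.reflect.Constructor;
import java.util.Arrays;

public class ConverterCompat {
    // Stand-in mimicking ParquetToSparkSchemaConverter on a runtime that
    // only exposes the three-argument constructor (no nanosAsLong).
    public static class ThreeArgConverter {
        public final boolean assumeBinaryIsString;
        public final boolean assumeInt96IsTimestamp;
        public final boolean caseSensitive;

        public ThreeArgConverter(boolean assumeBinaryIsString,
                                 boolean assumeInt96IsTimestamp,
                                 boolean caseSensitive) {
            this.assumeBinaryIsString = assumeBinaryIsString;
            this.assumeInt96IsTimestamp = assumeInt96IsTimestamp;
            this.caseSensitive = caseSensitive;
        }
    }

    /**
     * Instantiate with four boolean arguments if such a constructor exists
     * (stock Spark 3.3.2 with SPARK-40819); otherwise fall back to the
     * three-argument constructor (Databricks Runtime 12.2 LTS case).
     */
    public static Object newConverter(Class<?> cls, boolean binaryIsString,
                                      boolean int96IsTimestamp, boolean caseSensitive,
                                      boolean nanosAsLong)
            throws ReflectiveOperationException {
        for (Constructor<?> ctor : cls.getConstructors()) {
            Class<?>[] params = ctor.getParameterTypes();
            if (params.length == 4
                    && Arrays.stream(params).allMatch(p -> p == boolean.class)) {
                return ctor.newInstance(binaryIsString, int96IsTimestamp,
                                        caseSensitive, nanosAsLong);
            }
        }
        // No 4-arg constructor available: drop the nanosAsLong argument.
        Constructor<?> ctor = cls.getConstructor(boolean.class, boolean.class, boolean.class);
        return ctor.newInstance(binaryIsString, int96IsTimestamp, caseSensitive);
    }

    public static void main(String[] args) throws Exception {
        Object conv = newConverter(ThreeArgConverter.class, true, true, false, true);
        // Falls back to the 3-arg constructor on this stand-in class.
        System.out.println(conv.getClass().getSimpleName());
    }
}
```

This only sidesteps the `NoSuchMethodError` at the call site; a cleaner fix would presumably be to compile against the Databricks-provided jars rather than the open-source Spark artifacts.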