Discrepancies between official spark 3.3.2 and what's provided in Databricks Runtime 12.2 LTS leads to NoSuchMethodError when creating ParquetToSparkSchemaConverter

396827
New Contributor II

I am trying to run my Spark-dependent application on a Databricks cluster:

  • the application is built against Spark 3.3.2
  • the cluster runs Databricks Runtime 12.2 LTS (Spark 3.3.2, Scala 2.12)

I end up with:

NoSuchMethodError: org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter$.$lessinit$greater$default$4()Z

It seems that the Spark build included in Databricks Runtime 12.2 LTS does not contain the commit that changed the definition of ParquetToSparkSchemaConverter: https://issues.apache.org/jira/browse/SPARK-40819 (Spark GitHub: https://github.com/apache/spark/blame/a7bbaca013ad1ae92a437b12206fadfe93fea10f/sql/core/src/main/sca... ). The difference can also be spotted by comparing the release notes of Spark and of the Databricks Runtime.

To investigate further, I disassembled (with `javap -c`) the class that ships with Spark 3.3.2:

```
Compiled from "ParquetSchemaConverter.scala"
public class org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter {
  private final boolean assumeBinaryIsString;
  private final boolean assumeInt96IsTimestamp;
  private final boolean caseSensitive;
  private final boolean nanosAsLong;

  public static boolean $lessinit$greater$default$4();
    Code:
      0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #87 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$4:()Z
      6: ireturn

  public static boolean $lessinit$greater$default$3();
    Code:
      0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #90 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$3:()Z
      6: ireturn

  public static boolean $lessinit$greater$default$2();
    Code:
      0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #93 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$2:()Z
      6: ireturn

  public static boolean $lessinit$greater$default$1();
    Code:
      0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #96 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$1:()Z
      6: ireturn
```

And when checking the Databricks-provided jar:

```
Compiled from "ParquetSchemaConverter.scala"
public class org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter {
  private final boolean assumeBinaryIsString;
  private final boolean assumeInt96IsTimestamp;
  private final boolean caseSensitive;

  public static boolean $lessinit$greater$default$3();
    Code:
      0: getstatic     #84 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #86 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$3:()Z
      6: ireturn

  public static boolean $lessinit$greater$default$2();
    Code:
      0: getstatic     #84 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #89 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$2:()Z
      6: ireturn

  public static boolean $lessinit$greater$default$1();
    Code:
      0: getstatic     #84 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
      3: invokevirtual #92 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$1:()Z
      6: ireturn
```

So `public static boolean $lessinit$greater$default$4()` is in fact NOT present in the Databricks-provided class.

Can someone confirm that this difference is in fact expected? And how should it be handled?
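Since the presence of that one static method is exactly what differs between the two jars, one defensive option is to probe for it at startup and fail fast with a clear message instead of hitting the NoSuchMethodError deep inside a job. A minimal reflection sketch — the class and method names come from the stack trace above; the probe class itself is illustrative and not part of Spark or Databricks:

```java
// Illustrative probe: Scala encodes constructor default arguments as static
// forwarder methods named $lessinit$greater$default$N on the class, so checking
// for $lessinit$greater$default$4 distinguishes the stock Spark 3.3.2 class
// (four constructor parameters) from the DBR 12.2 LTS one (three).
public class DefaultParamProbe {

    static boolean hasFourthConstructorDefault(String className) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getMethod("$lessinit$greater$default$4"); // public static forwarder
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String converter =
            "org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter";
        if (!hasFourthConstructorDefault(converter)) {
            // On DBR 12.2 LTS (or when Spark is not on the classpath) we land here.
            System.out.println("Runtime lacks $lessinit$greater$default$4 "
                + "(pre-SPARK-40819 ParquetToSparkSchemaConverter); "
                + "avoid code paths that construct the converter with its 4th default argument.");
        }
    }
}
```

Running the probe early (e.g. in an init action or the driver's main) turns a cryptic link-time failure into an explicit diagnostic; it does not fix the mismatch itself, which ultimately requires compiling against the same class the runtime ships.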

1 REPLY

OsamaNabih
New Contributor II

I ran into the same issue. Did you happen to find a fix?
