Discrepancies between official Spark 3.3.2 and what's provided in Databricks Runtime 12.2 LTS lead to NoSuchMethodError when creating ParquetToSparkSchemaConverter

396827
New Contributor II

I am trying to run my Spark-dependent application on a Databricks cluster:

  • built against Spark 3.3.2
  • on a cluster with Databricks Runtime 12.2 LTS (Spark 3.3.2, Scala 2.12)

I end up with:

NoSuchMethodError: org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter$.$lessinit$greater$default$4()Z

It seems that the Spark included in Databricks Runtime 12.2 LTS does not include the commit that changed the definition of ParquetToSparkSchemaConverter: https://issues.apache.org/jira/browse/SPARK-40819 (Spark GitHub: https://github.com/apache/spark/blame/a7bbaca013ad1ae92a437b12206fadfe93fea10f/sql/core/src/main/sca... ). The difference can also be spotted by comparing the release notes of Spark and of the Databricks Runtime.
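For background on the odd method name: `$lessinit$greater` is the JVM-safe encoding of `<init>` (the constructor), and for every constructor parameter with a default value scalac emits a synthetic static accessor `$lessinit$greater$default$N` that returns that default. A minimal stand-in class (not Spark code; the parameter names just mirror the converter's fields) reproduces the same pattern:

```scala
// Stand-in (NOT Spark code): a class whose constructor has four defaulted
// Boolean parameters, mirroring ParquetToSparkSchemaConverter's shape after
// SPARK-40819 added the fourth one.
class ConverterLike(
    val assumeBinaryIsString: Boolean = false,
    val assumeInt96IsTimestamp: Boolean = true,
    val caseSensitive: Boolean = false,
    val nanosAsLong: Boolean = false) // 4th defaulted parameter

object DefaultAccessorDemo {
  def main(args: Array[String]): Unit = {
    // scalac puts the accessors on the companion object and adds static
    // forwarders on the class itself, exactly as in the javap dump below.
    val names = classOf[ConverterLike].getMethods.map(_.getName).toSet
    names.filter(_.startsWith("$lessinit$greater$default")).toList.sorted
      .foreach(println)
  }
}
```

A caller compiled against the four-parameter class links directly against `$lessinit$greater$default$4()Z`, so running it on a classpath where the class only has three parameters fails at link time with exactly the `NoSuchMethodError` above.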

To investigate further, I disassembled the class that ships with Spark 3.3.2:

```
Compiled from "ParquetSchemaConverter.scala"
public class org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter {
  private final boolean assumeBinaryIsString;
  private final boolean assumeInt96IsTimestamp;
  private final boolean caseSensitive;
  private final boolean nanosAsLong;

  public static boolean $lessinit$greater$default$4();
    Code:
       0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #87 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$4:()Z
       6: ireturn

  public static boolean $lessinit$greater$default$3();
    Code:
       0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #90 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$3:()Z
       6: ireturn

  public static boolean $lessinit$greater$default$2();
    Code:
       0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #93 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$2:()Z
       6: ireturn

  public static boolean $lessinit$greater$default$1();
    Code:
       0: getstatic     #85 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #96 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$1:()Z
       6: ireturn
```

When I check the Databricks-provided jar, I get:

```
Compiled from "ParquetSchemaConverter.scala"
public class org.apache.spark.sql.execution.datasources.parquet.ParquetToSparkSchemaConverter {
  private final boolean assumeBinaryIsString;
  private final boolean assumeInt96IsTimestamp;
  private final boolean caseSensitive;

  public static boolean $lessinit$greater$default$3();
    Code:
       0: getstatic     #84 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #86 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$3:()Z
       6: ireturn

  public static boolean $lessinit$greater$default$2();
    Code:
       0: getstatic     #84 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #89 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$2:()Z
       6: ireturn

  public static boolean $lessinit$greater$default$1();
    Code:
       0: getstatic     #84 // Field org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.MODULE$:Lorg/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$;
       3: invokevirtual #92 // Method org/apache/spark/sql/execution/datasources/parquet/ParquetToSparkSchemaConverter$.$lessinit$greater$default$1:()Z
       6: ireturn
```

So `public static boolean $lessinit$greater$default$4()` is in fact NOT present in the Databricks-provided jar.

Can someone confirm that this change is in fact expected? How should that be handled?
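In the meantime, one workaround I'm considering (just a sketch, not Databricks guidance) is to stop linking against a fixed constructor signature and instead pick a Boolean-only constructor by arity at runtime, so the same jar works whether the class on the cluster has three or four parameters. The `FourOrThree` class below is a hypothetical stand-in; with Spark on the classpath you would pass `classOf[ParquetToSparkSchemaConverter]` and your three flags instead:

```scala
object ArityTolerantConstruction {
  // Select the widest constructor whose parameters are all primitive
  // booleans; flags we don't have are padded with `false` (which matches
  // the OSS default for the 4th parameter, nanosAsLong). This avoids a
  // compile-time link to the 4-argument signature that DBR 12.2 lacks.
  def construct(cls: Class[_], flags: Boolean*): AnyRef = {
    val ctor = cls.getConstructors
      .filter(_.getParameterTypes.forall(_ == java.lang.Boolean.TYPE))
      .sortBy(-_.getParameterCount)
      .headOption
      .getOrElse(sys.error(s"no boolean-only constructor on ${cls.getName}"))
    val args = (flags ++ Seq.fill(ctor.getParameterCount)(false))
      .take(ctor.getParameterCount)
      .map(java.lang.Boolean.valueOf)
    ctor.newInstance(args: _*).asInstanceOf[AnyRef]
  }
}

// Hypothetical stand-in exposing both arities, mimicking OSS 3.3.2
// (4 Booleans) vs DBR 12.2 (3 Booleans):
class FourOrThree(val a: Boolean, val b: Boolean,
                  val c: Boolean, val d: Boolean) {
  def this(a: Boolean, b: Boolean, c: Boolean) = this(a, b, c, false)
}
```

The cleaner long-term fix is presumably to not bundle Spark classes at all (e.g. mark Spark as a provided dependency so the runtime's own classes are used at both compile and run time), since the Databricks fork is evidently not binary-compatible with the Apache release.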

1 REPLY

OsamaNabih
New Contributor II

I ran into the same issue. Did you happen to find any fix?
