Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.

cluster and workflow issue

Ved88
New Contributor III

I added the following Maven libraries to prerequisites_maven.yml:

  com.databricks:spark-xml_2.12:0.18.0
  com.crealytics:spark-excel_2.12:3.4.3_0.20.4

I then created a cluster with them. The notebook runs on this updated cluster, but jobs are failing with:

 

UnknownException: (java.util.ServiceConfigurationError) org.apache.spark.sql.sources.DataSourceRegister: com.databricks.spark.xml.DefaultSource Unable to get public no-arg constructor

Accepted Solution

bianca_unifeye
Databricks MVP

This is a classpath mismatch between the interactive cluster and the Workflow job cluster.

What I believe happened: the notebook was running on an all-purpose cluster with the Maven libraries attached, but the job was using a separate job cluster that did not have (or had conflicting versions of) spark-xml.
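To check that theory, it can help to list the libraries on each cluster and compare them. Below is a minimal sketch in plain Python, assuming you have copied the Maven coordinates from each cluster's Libraries tab into two lists. The function names and the example lists are hypothetical, not part of any Databricks API.

```python
def parse_coordinate(coordinate):
    """Split 'group:artifact:version' into its three parts."""
    group, artifact, version = coordinate.split(":")
    return group, artifact, version


def compare_libraries(notebook_libs, job_libs):
    """Report libraries the job cluster lacks or pins to a different version."""
    job_versions = {}
    for coordinate in job_libs:
        group, artifact, version = parse_coordinate(coordinate)
        job_versions[(group, artifact)] = version

    missing, mismatched = [], []
    for coordinate in notebook_libs:
        group, artifact, version = parse_coordinate(coordinate)
        job_version = job_versions.get((group, artifact))
        if job_version is None:
            # The job cluster does not have this library at all.
            missing.append(coordinate)
        elif job_version != version:
            # Same library, but a different version on the job cluster.
            mismatched.append((coordinate, job_version))
    return missing, mismatched


# Hypothetical inputs: the notebook cluster has both libraries,
# the job cluster has only spark-excel.
notebook_libs = [
    "com.databricks:spark-xml_2.12:0.18.0",
    "com.crealytics:spark-excel_2.12:3.4.3_0.20.4",
]
job_libs = ["com.crealytics:spark-excel_2.12:3.4.3_0.20.4"]

missing, mismatched = compare_libraries(notebook_libs, job_libs)
print("Missing on job cluster:", missing)
print("Version mismatches:", mismatched)
```

If either list comes back non-empty, the job cluster's classpath differs from the one the notebook ran on, which would be consistent with the error above.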

Fix

  • Add the Maven libraries directly to the Workflow job cluster (or job YAML), not just the interactive cluster:

    • com.databricks:spark-xml_2.12:0.18.0

    • com.crealytics:spark-excel_2.12:3.4.3_0.20.4

  • Ensure there are no duplicate or conflicting spark-xml jars

  • Confirm the same DBR/Spark version is used for both notebook and job
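
As a rough illustration of the first and last points, here is a sketch that builds job settings with the Maven libraries attached to the task itself and the runtime version passed in explicitly. The field names (`tasks`, `new_cluster`, `libraries`, `maven.coordinates`) follow the Jobs API as I understand it, so please verify them against the current docs. The job name, task key, notebook path, and angle-bracket placeholders are hypothetical and need to be replaced with your own values.

```python
import json

# The two coordinates from the original post.
MAVEN_COORDINATES = [
    "com.databricks:spark-xml_2.12:0.18.0",
    "com.crealytics:spark-excel_2.12:3.4.3_0.20.4",
]


def build_job_settings(job_name, notebook_path, spark_version, node_type_id):
    """Build job settings with Maven libraries attached to the task's job cluster."""
    return {
        "name": job_name,
        "tasks": [
            {
                "task_key": "main",  # hypothetical task key
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    # Use the same runtime version string as the notebook cluster.
                    "spark_version": spark_version,
                    "node_type_id": node_type_id,
                    "num_workers": 1,
                },
                # Libraries are declared on the task, not only on the interactive cluster.
                "libraries": [
                    {"maven": {"coordinates": coordinate}}
                    for coordinate in MAVEN_COORDINATES
                ],
            }
        ],
    }


settings = build_job_settings(
    job_name="xml-excel-job",  # hypothetical name
    notebook_path="/Workspace/Users/someone@example.com/my_notebook",  # hypothetical path
    spark_version="<runtime version of the notebook cluster>",  # placeholder
    node_type_id="<node type for your cloud>",  # placeholder
)
print(json.dumps(settings, indent=2))
```

The printed JSON is only a starting point for a job definition. The same `libraries` entries can be added to a job YAML instead if that is how your jobs are deployed.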


2 REPLIES

youssefmrini
Databricks Employee

You can now read Excel files natively in Databricks: https://docs.databricks.com/aws/en/query/formats/excel