Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Unittest in PySpark - how to read XML with Maven com.databricks.spark.xml ?

Michael_Galli
Contributor III

When writing unit tests with unittest / pytest in PySpark, reading mock data sources with built-in formats such as CSV and JSON (spark.read.format("json")) works just fine.

But when reading XML files with spark.read.format("com.databricks.spark.xml") in a unit test, this does not work out of the box:

java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml.

NOTE: the unit tests do NOT run on a Databricks cluster, but locally, using a Hadoop winutils setup.

Is there any way to make this work, or should I fall back to Python's built-in XML libraries?
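For comparison, the fallback mentioned in the question is workable in tests: parse a small XML fixture with Python's built-in xml.etree.ElementTree and hand the resulting rows to spark.createDataFrame, avoiding the spark-xml dependency entirely. A minimal sketch of the parsing half (the element and attribute names are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fixture; the tag and attribute names are made up
# purely for this sketch.
MOCK_XML = """
<books>
  <book id="1"><title>Spark</title></book>
  <book id="2"><title>XML</title></book>
</books>
"""

def parse_books(xml_text):
    """Parse the fixture into plain dicts, ready for spark.createDataFrame."""
    root = ET.fromstring(xml_text)
    return [
        {"id": int(book.get("id")), "title": book.findtext("title")}
        for book in root.findall("book")
    ]
```

In a test, the output of parse_books(MOCK_XML) can be passed straight to spark.createDataFrame, so the assertion logic stays the same as for the CSV/JSON mocks.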

1 ACCEPTED SOLUTION

Accepted Solutions

This is correct. The following worked for me:

SparkSession.builder.(..).config("spark.jars.packages", "com.databricks:spark-xml_2.12:0.12.0")
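Spelled out, the accepted solution amounts to setting spark.jars.packages before the session is created, so Spark resolves the spark-xml JAR from Maven Central at startup. A minimal sketch, where the helper name and app name are illustrative and the 2.12/0.12.0 coordinate is the one from this thread:

```python
# Coordinate taken from this thread (Scala 2.12 build of spark-xml 0.12.0).
SPARK_XML_COORDINATE = "com.databricks:spark-xml_2.12:0.12.0"

def local_xml_session_config(app_name="xml-unit-tests"):
    """Illustrative helper: builder options for a local test session
    that can read XML through spark-xml."""
    return {
        "spark.master": "local[1]",
        "spark.app.name": app_name,
        # Resolved from Maven Central the first time the session starts,
        # so nothing has to be installed into the local Spark manually.
        "spark.jars.packages": SPARK_XML_COORDINATE,
    }

# Applying it (requires pyspark; shown as comments so the sketch runs
# without a Spark installation):
#
# from pyspark.sql import SparkSession
# builder = SparkSession.builder
# for key, value in local_xml_session_config().items():
#     builder = builder.config(key, value)
# spark = builder.getOrCreate()
# df = (spark.read.format("com.databricks.spark.xml")
#       .option("rowTag", "book")   # "book" is a made-up row tag
#       .load("mock.xml"))
```

Note that the package must be configured before getOrCreate() is called; adding it to an already-running session has no effect.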


4 REPLIES

-werners-
Esteemed Contributor III

I suppose you are running Spark locally? com.databricks.spark.xml is an external Spark library.

It is not installed by default, so you have to add it to your local Spark installation.


Hubert-Dudek
Esteemed Contributor III

Please install spark-xml from Maven. Since it comes from Maven, you need to install it on the cluster you are using, via the cluster settings (or alternatively via the API or CLI).

https://mvnrepository.com/artifact/com.databricks/spark-xml

See above, I already found the solution. There is no cluster here, only a local Spark session.
