Getting "java.lang.ClassNotFoundException: Failed to find data source: xml" error when loading XML
08-03-2018 10:35 AM
Both of the following commands fail:
df1 = sqlContext.read.format("xml").load(loadPath)
df2 = sqlContext.read.format("com.databricks.spark.xml").load(loadPath)
with the following error message:
java.lang.ClassNotFoundException: Failed to find data source: xml. Please find packages at http://spark.apache.org/third-party-projects.html
I read several articles on this forum, but none had a resolution. I thought Databricks already had the XML library installed. This is on a DBC cluster running "4.2 (includes Apache Spark 2.3.1, Scala 2.11)".