I'm trying to read an XML file and am getting the error below. I've installed the spark-xml Maven library on the cluster, but the error persists. Is there anything I'm missing?
Error:
AnalysisException: [NATIVE_XML_DATA_SOURCE_NOT_ENABLED] Native XML Data Source is not enabled in this cluster.
Runtime version: 14.1 (includes Apache Spark 3.5.0, Scala 2.12)
Library version: com.databricks:spark-xml_2.12:0.17.0
Code:
from pyspark.sql import SparkSession
# Initialize Spark session
spark = SparkSession.builder.appName("XMLRead").getOrCreate()
# Path to the XML file
xml_path = '<path>'
# Read the XML file
df = spark.read \
    .format("xml") \
    .option("rootTag", "catalog") \
    .option("rowTag", "book") \
    .load(xml_path)
# Show the DataFrame
df.show()