
NATIVE_XML_DATA_SOURCE_NOT_ENABLED

Ajbi
New Contributor II

I'm trying to read an XML file and am receiving the following error. I've installed the Maven library spark-xml on the cluster, but I still get the error. Is there anything I'm missing?

Error

AnalysisException: [NATIVE_XML_DATA_SOURCE_NOT_ENABLED] Native XML Data Source is not enabled in this cluster.

Runtime version: 14.1 (includes Apache Spark 3.5.0, Scala 2.12)

Library version: com.databricks:spark-xml_2.12:0.17.0

code

 

from pyspark.sql import SparkSession

# Initialize Spark session
spark = SparkSession.builder.appName("XMLRead").getOrCreate()

# Path to the XML file
xml_path = '<path>'

# Read the XML file
df = spark.read \
    .format("xml") \
    .option("rootTag", "catalog") \
    .option("rowTag", "book") \
    .load(xml_path)

# Show the DataFrame
df.show()

2 REPLIES

daniel_sahal
Esteemed Contributor

@Ajbi AFAIK DBR 14.1 was supposed to support XML out of the box. Maybe that's the case?

 

Can you try using spark.read.format('com.databricks.spark.xml')... instead?
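
For reference, the suggestion above could be sketched roughly as follows. This is only an illustration: the helper name `read_books_xml` and its default argument are hypothetical, the options are copied from the question, and nothing in this thread confirms that naming the source explicitly avoids the error on DBR 14.1.

```python
def read_books_xml(spark, xml_path, source_format="com.databricks.spark.xml"):
    """Read <book> rows from an XML file using an explicitly named data source.

    `spark` is an existing SparkSession (in a Databricks notebook, the
    predefined `spark` object). The function name and the default value of
    `source_format` are illustrative assumptions, not part of any library.
    """
    return (
        spark.read
        .format(source_format)         # fully qualified name instead of "xml"
        .option("rootTag", "catalog")  # same options as in the question
        .option("rowTag", "book")      # each <book> element becomes one row
        .load(xml_path)
    )
```

In a notebook this would be called as `df = read_books_xml(spark, xml_path)` followed by `df.show()`.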

Ajbi
New Contributor II

I've already tried spark.read.format('com.databricks.spark.xml'). It produces the same error.

 
