
NATIVE_XML_DATA_SOURCE_NOT_ENABLED

Ajbi
New Contributor II

I'm trying to read an XML file and am getting the error below. I've installed the Maven library spark-xml on the cluster, but the error persists. Is there anything I'm missing?

Error

AnalysisException: [NATIVE_XML_DATA_SOURCE_NOT_ENABLED] Native XML Data Source is not enabled in this cluster.

Runtime version: 14.1 (includes Apache Spark 3.5.0, Scala 2.12)

Library version: com.databricks:spark-xml_2.12:0.17.0

Code

 

from pyspark.sql import SparkSession

# Initialize Spark session
spark = SparkSession.builder.appName("XMLRead").getOrCreate()

# Path to the XML file
xml_path = '<path>'

# Read the XML file
df = spark.read \
    .format("xml") \
    .option("rootTag", "catalog") \
    .option("rowTag", "book") \
    .load(xml_path)

# Show the DataFrame
df.show()


daniel_sahal
Esteemed Contributor

@Ajbi As far as I know, DBR 14.1 was supposed to support XML out of the box. Maybe that's the cause?

 

Can you try using spark.read.format('com.databricks.spark.xml')... instead?
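A minimal sketch of that suggestion, assuming you want to try both format names in turn and see which one (if any) the cluster accepts. The helper name `read_xml_with_fallback` is hypothetical, not part of Spark or spark-xml; it only uses the `format`, `option`, and `load` calls already shown in the question. It won't fix a cluster where both names are rejected, but it collects the error from each attempt so you can compare them.

```python
def read_xml_with_fallback(spark, path, row_tag,
                           formats=("xml", "com.databricks.spark.xml")):
    """Try each XML format name in order; return (format_used, DataFrame).

    Hypothetical helper for diagnosis only. `spark` is an existing
    SparkSession; `formats` lists the short name and the full
    spark-xml data source name from this thread.
    """
    errors = {}
    for fmt in formats:
        try:
            df = spark.read.format(fmt).option("rowTag", row_tag).load(path)
            return fmt, df
        except Exception as exc:  # e.g. AnalysisException on Databricks
            # Keep the message so the failures can be compared afterwards.
            errors[fmt] = str(exc)
    raise RuntimeError(f"No XML data source could read {path!r}: {errors}")
```

Usage would look like `fmt, df = read_xml_with_fallback(spark, xml_path, "book")`, followed by `df.show()`. If it raises, the message lists the error returned for each format name.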

Ajbi
New Contributor II

I've already tried spark.read.format('com.databricks.spark.xml'). It raises the same error.

 
