
NATIVE_XML_DATA_SOURCE_NOT_ENABLED

Ajbi
New Contributor II

I'm trying to read an XML file and I'm receiving the following error. I've installed the Maven library spark-xml on the cluster, but I still get the error. Is there anything I'm missing?

Error

AnalysisException: [NATIVE_XML_DATA_SOURCE_NOT_ENABLED] Native XML Data Source is not enabled in this cluster.

runtime version: 14.1 (includes Apache Spark 3.5.0, Scala 2.12)

library version : com.databricks:spark-xml_2.12:0.17.0

code

 

from pyspark.sql import SparkSession

# Initialize Spark session
spark = SparkSession.builder.appName("XMLRead").getOrCreate()

# Path to the XML file
xml_path = '<path>'

# Read the XML file
df = spark.read \
    .format("xml") \
    .option("rootTag", "catalog") \
    .option("rowTag", "book") \
    .load(xml_path)

# Show the DataFrame
df.show()

2 REPLIES

daniel_sahal
Esteemed Contributor

@Ajbi AFAIK DBR 14.1 was supposed to support XML out of the box. Maybe that's the case?

 

Can you try using spark.read.format('com.databricks.spark.xml')... instead?
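
A rough sketch of that suggestion, in case it helps: try the short format name first, then the fully qualified spark-xml name, and collect whatever error each one raises. The helper name `read_xml_with_fallback` and its parameters are made up for illustration, and there is no guarantee that either format name works on a given runtime.

```python
def read_xml_with_fallback(spark, path, row_tag,
                           formats=("xml", "com.databricks.spark.xml")):
    """Try each format name in order and return (format_used, DataFrame).

    Hypothetical helper: `spark` is an existing SparkSession, `path` is the
    XML file location, and `row_tag` is the element treated as one row.
    """
    errors = []
    for fmt in formats:
        try:
            df = spark.read.format(fmt).option("rowTag", row_tag).load(path)
            return fmt, df
        except Exception as exc:  # e.g. an AnalysisException raised by load()
            errors.append(f"{fmt}: {exc}")
    # Every format failed, so report all collected errors together.
    raise RuntimeError("No XML format worked:\n" + "\n".join(errors))
```

Calling `read_xml_with_fallback(spark, xml_path, "book")` with the session and path from the question shows which format name, if any, the cluster accepts, and otherwise lists the error for each attempt.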

Ajbi
New Contributor II

I've already tried spark.read.format('com.databricks.spark.xml'). It gives the same error.

 
