Data Engineering

Databricks Optimization

Akshay9
New Contributor

I am trying to read 30 XML files and create a DataFrame from the data of each node, but it takes a lot of time (approximately 8 minutes) to process those files and append the data to a Databricks Delta table. What can I do to optimize the Databricks notebook?

1 REPLY

Kaniz
Community Manager

Hi @Akshay9, use the spark-xml (com.databricks.spark.xml) package. It provides an efficient way to read and write XML datasets in Spark: it can automatically infer the XML schema and offers additional options for fine-tuning it. You can use this package to read the XML data and convert it into a Spark DataFrame, then append that DataFrame to your Delta table.
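A minimal PySpark sketch of that approach, assuming the spark-xml library is attached to the cluster; the rowTag value, column names, file path, and table name below are placeholders to replace with your own:

from pyspark.sql.types import StructType, StructField, StringType

# Defining the schema up front avoids a second pass over the files for inference.
# (These column names are illustrative only.)
schema = StructType([
    StructField("id", StringType(), True),
    StructField("name", StringType(), True),
    StructField("value", StringType(), True),
])

# "spark" is already defined in a Databricks notebook.
df = (
    spark.read.format("com.databricks.spark.xml")
    .option("rowTag", "record")        # the XML element that maps to one row
    .schema(schema)                    # skip schema inference
    .load("/mnt/raw/xml_files/*.xml")  # read all 30 files in a single job
)

# Append the result to a Delta table in a single write.
df.write.format("delta").mode("append").saveAsTable("xml_data")

Supplying an explicit schema skips the inference pass over all 30 files, and loading them with one read followed by one append avoids running 30 separate read/write jobs, which is usually where most of the time goes.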
