Databricks Community

TinaN · 08-06-2024

We are loading data from a source system that contains XML, and I am translating that system's queries into views in Databricks. The queries use 'XMLNAMESPACES' to construct and parse XML; an example is below. What is the best practice for translating 'XMLNAMESPACES' in Databricks?

CREATE OR REPLACE VIEW source_vw
AS WITH XMLNAMESPACES('uuid:ee2fbfd9-47a5-4dc8-a9eb-42d9995802ab' as REM)
SELECT...

TinaN · 08-08-2024

Hello Kaniz_Fatma,

Thanks for the quick response. We are experimenting with from_xml to parse the data. I appreciate your input.

Best,

Tina


Retired_mod · 08-08-2024

Hi @TinaN, to handle XMLNAMESPACES in Databricks, parse the XML with the from_xml function and account for the namespaces in your parsing logic. If the XML arrives as files, read it with spark.read.format("xml"); if it is stored in a string column, apply from_xml to that column with a schema that covers the namespaced elements. This lets you translate SQL queries that rely on XML namespaces into Databricks code. If you have any specific details or additional requirements, feel free to share them!
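
For anyone landing here later, a rough sketch of what the translated view might look like. Everything specific in it is an assumption for illustration: the table raw_source, its columns id and xml_payload, and the element name Name do not come from the original post. It also assumes a runtime where from_xml is available, and that the ignoreNamespace reader option is honored by from_xml the same way it is by the XML file reader, which is worth verifying on your cluster. Databricks SQL has no direct counterpart to WITH XMLNAMESPACES, so the sketch shows two ways to work around the REM prefix instead of declaring it.

```sql
-- Hypothetical source: raw_source(id, xml_payload STRING), one XML document per row,
-- with elements such as <REM:Name> under the root. Names are illustrative only.
CREATE OR REPLACE VIEW source_vw AS
SELECT
  id,
  -- Option A: XPath that matches on the local element name,
  -- so the REM prefix never has to be declared.
  xpath_string(xml_payload, '//*[local-name()="Name"]') AS name_via_xpath,
  -- Option B: from_xml with an explicit schema. ignoreNamespace is assumed
  -- to strip the prefix so that <REM:Name> maps to the field Name.
  from_xml(
    xml_payload,
    'Name STRING',
    map('ignoreNamespace', 'true')
  ).Name AS name_via_from_xml
FROM raw_source;
```

Option A is handy for pulling a few scalar values out of the XML. Option B tends to fit better when the view needs many fields or nested structures, since the schema string can describe the whole document at once.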