<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to extract a full node from an xml string using sql in Warehousing &amp; Analytics</title>
    <link>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156482#M2573</link>
    <description>&lt;P&gt;Thanks very much.&amp;nbsp; I had over-simplified my example perhaps, and&amp;nbsp;got stuck on the idea of using xml functions, particularly because I'm converting existing SQL from another DB, and wasn't being open-minded enough.&lt;/P&gt;&lt;P&gt;The actual data has multiple instances of the b element and varying contents of each that I want to retrieve as an array.&amp;nbsp; Using your first suggestion but with regexp_extract_all works perfectly e.g.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SELECT regexp_extract_all('&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;d&amp;gt;1&amp;lt;/d&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c2&amp;lt;/c&amp;gt;&amp;lt;e&amp;gt;x&amp;lt;/e&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;', '(&amp;lt;b&amp;gt;.*?&amp;lt;/b&amp;gt;)', 1) out&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 09 May 2026 07:47:08 GMT</pubDate>
    <dc:creator>azl</dc:creator>
    <dc:date>2026-05-09T07:47:08Z</dc:date>
    <item>
      <title>How to extract a full node from an xml string using sql</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156476#M2571</link>
      <description>&lt;P&gt;I have an XML string in my table and I want to extract a full node from it using SQL functions, without needing to parse it in full i.e. from&amp;nbsp;'&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;' I want '&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;' (preferably) or '&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;'.&lt;/P&gt;&lt;P&gt;What little I can find that does not use Spark suggests it should work but I just get null running on&amp;nbsp;DBSQL v 2026.10.&lt;/P&gt;&lt;P&gt;I've tried:&lt;BR /&gt;SELECT xpath('&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;', '/a/b')&lt;BR /&gt;SELECT xpath('&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;', '/a/b/.')&lt;BR /&gt;SELECT xpath('&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;', '/a/b/node()')&lt;BR /&gt;SELECT xpath('&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;', '//b')&lt;/P&gt;&lt;P&gt;Please help.&lt;/P&gt;</description>
      <pubDate>Sat, 09 May 2026 02:46:02 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156476#M2571</guid>
      <dc:creator>azl</dc:creator>
      <dc:date>2026-05-09T02:46:02Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a full node from an xml string using sql</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156481#M2572</link>
      <description>&lt;P&gt;xpath returns an array of string values containing text content from matching nodes.&lt;/P&gt;&lt;P&gt;You can follow below&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;Reg Exp Extract&lt;/LI&gt;&lt;/UL&gt;&lt;PRE&gt;SELECT&lt;SPAN&gt; regexp_extract(&lt;/SPAN&gt;&lt;SPAN&gt;'&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;'(&amp;lt;b&amp;gt;.*?&amp;lt;/b&amp;gt;)'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;1&lt;/SPAN&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;out&lt;/SPAN&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;XML Functions &amp;amp; Reg Exp&lt;/LI&gt;&lt;/UL&gt;&lt;DIV&gt;&lt;PRE&gt;&lt;SPAN&gt;SELECT&lt;/SPAN&gt; &lt;SPAN&gt;trim&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; regexp_replace(&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; regexp_replace(&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; to_xml(named_struct(&lt;/SPAN&gt;&lt;SPAN&gt;'b'&lt;/SPAN&gt;&lt;SPAN&gt;, from_xml(&lt;/SPAN&gt;&lt;SPAN&gt;'&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;'struct&amp;lt;b:struct&amp;lt;c:string&amp;gt;&amp;gt;'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;map&lt;/SPAN&gt;&lt;SPAN&gt;(&lt;/SPAN&gt;&lt;SPAN&gt;'rowTag'&lt;/SPAN&gt;&lt;SPAN&gt;, &lt;/SPAN&gt;&lt;SPAN&gt;'a'&lt;/SPAN&gt;&lt;SPAN&gt;)).b)),&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;'&amp;lt;ROW&amp;gt;|&amp;lt;/ROW&amp;gt;'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;''&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; ),&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;'&amp;gt;\\s+&amp;lt;'&lt;/SPAN&gt;&lt;SPAN&gt;,&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;&lt;SPAN&gt;'&amp;gt;&amp;lt;'&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;&amp;nbsp; )&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN&gt;) &lt;/SPAN&gt;&lt;SPAN&gt;out&lt;/SPAN&gt;&lt;/PRE&gt;&lt;UL&gt;&lt;LI&gt;Create Python scalar UDF for XML extraction &amp;amp; r&lt;SPAN&gt;egister it as a UDF and use it from SQL&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/DIV&gt;</description>
      <pubDate>Sat, 09 May 2026 07:09:00 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156481#M2572</guid>
      <dc:creator>balajij8</dc:creator>
      <dc:date>2026-05-09T07:09:00Z</dc:date>
    </item>
    <item>
      <title>Re: How to extract a full node from an xml string using sql</title>
      <link>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156482#M2573</link>
      <description>&lt;P&gt;Thanks very much.&amp;nbsp; I had over-simplified my example perhaps, and&amp;nbsp;got stuck on the idea of using xml functions, particularly because I'm converting existing SQL from another DB, and wasn't being open-minded enough.&lt;/P&gt;&lt;P&gt;The actual data has multiple instances of the b element and varying contents of each that I want to retrieve as an array.&amp;nbsp; Using your first suggestion but with regexp_extract_all works perfectly e.g.&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;SELECT regexp_extract_all('&amp;lt;a&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c1&amp;lt;/c&amp;gt;&amp;lt;d&amp;gt;1&amp;lt;/d&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;b&amp;gt;&amp;lt;c&amp;gt;c2&amp;lt;/c&amp;gt;&amp;lt;e&amp;gt;x&amp;lt;/e&amp;gt;&amp;lt;/b&amp;gt;&amp;lt;/a&amp;gt;', '(&amp;lt;b&amp;gt;.*?&amp;lt;/b&amp;gt;)', 1) out&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 09 May 2026 07:47:08 GMT</pubDate>
      <guid>https://community.databricks.com/t5/warehousing-analytics/how-to-extract-a-full-node-from-an-xml-string-using-sql/m-p/156482#M2573</guid>
      <dc:creator>azl</dc:creator>
      <dc:date>2026-05-09T07:47:08Z</dc:date>
    </item>
  </channel>
</rss>

