<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Using &amp;quot;FOR XML PATH&amp;quot; in Spark SQL in sql syntax in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7057#M3040</link>
    <description>&lt;P&gt;@Jay Yang&lt;/P&gt;&lt;P&gt;You can use a combination of array_join and collect_list. See example below.&lt;/P&gt;&lt;P&gt;Source table:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/470iFE51EE015B1E4257/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;After transformation:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/479i28FABE60CBDB0972/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 29 Mar 2023 09:47:36 GMT</pubDate>
    <dc:creator>daniel_sahal</dc:creator>
    <dc:date>2023-03-29T09:47:36Z</dc:date>
    <item>
      <title>Using "FOR XML PATH" in Spark SQL in sql syntax</title>
      <link>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7056#M3039</link>
      <description>&lt;P&gt;I'm using Spark 3.2.1 on Databricks (DBR 10.4 LTS), and I'm trying to convert a SQL Server query into one that runs on a Spark cluster using Spark SQL in SQL syntax. However, Spark SQL does not seem to support FOR XML PATH, and I wonder whether there is an alternative way to rewrite this SQL Server query so that Spark SQL accepts it. The original SQL Server query looks like this:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;DROP TABLE if exists UserCountry;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;CREATE TABLE if not exists UserCountry (&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;UserID INT,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;Country VARCHAR(5000)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;);&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;INSERT INTO UserCountry&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;SELECT&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;	L.UserID AS UserID,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;	COALESCE(&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;		STUFF(&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;			(SELECT ', ' + LC.Country FROM UserStopCountry LC WHERE L.UserID = LC.UserID FOR XML PATH (''))&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;			, 1, 2, '')&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;		, '') AS Country&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;FROM&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;LK_ETLRunUserID L&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;When I run the query above in Databricks Spark SQL, I get the following error:&lt;/P&gt;&lt;P&gt;ParseException:&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;mismatched input 'FOR' expecting {')', '.', '[', 'AND', 'BETWEEN', 'CLUSTER', 'DISTRIBUTE', 'DIV', 'EXCEPT', 'GROUP', 'HAVING', 'IN', 'INTERSECT', 'IS', 'LIKE', 'ILIKE', 'LIMIT', NOT, 'OR', 'ORDER', 'QUALIFY', RLIKE, 'MINUS', 'SORT', 'UNION', 'WINDOW', EQ, '&amp;lt;=&amp;gt;', '&amp;lt;&amp;gt;', '!=', '&amp;lt;', LTE, '&amp;gt;', GTE, '+', '-', '*', '/', '%', '&amp;amp;', '|', '||', '^', ':', '::'}(line 6, pos 80)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;== SQL ==&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;INSERT INTO UserCountry&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;SELECT&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;	L.UserID AS UserID,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;	COALESCE(&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;		STUFF(&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;			(SELECT ', ' + LC.Country FROM UserStopCountry LC WHERE L.UserID = LC.UserID FOR XML PATH (''))&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;--------------------------------------------------------------------------------^^^&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;			, 1, 2, '')&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;		, '') AS Country&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;FROM&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;LK_ETLRunUserID L&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Given that the UserStopCountry table looks like this:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="input"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/469iCE07026F4CF2380E/image-size/large?v=v2&amp;amp;px=999" role="button" title="input" alt="input" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I expect the output to be:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="output"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/481i7A508A1CBCEDE8A2/image-size/large?v=v2&amp;amp;px=999" role="button" title="output" alt="output" /&gt;&lt;/span&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 27 Mar 2023 04:50:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7056#M3039</guid>
      <dc:creator>oleole</dc:creator>
      <dc:date>2023-03-27T04:50:34Z</dc:date>
    </item>
    <item>
      <title>Re: Using "FOR XML PATH" in Spark SQL in sql syntax</title>
      <link>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7057#M3040</link>
      <description>&lt;P&gt;@Jay Yang&lt;/P&gt;&lt;P&gt;You can use a combination of array_join and collect_list. See example below.&lt;/P&gt;&lt;P&gt;Source table:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/470iFE51EE015B1E4257/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;After transformation:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="image"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/479i28FABE60CBDB0972/image-size/large?v=v2&amp;amp;px=999" role="button" title="image" alt="image" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Mar 2023 09:47:36 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7057#M3040</guid>
      <dc:creator>daniel_sahal</dc:creator>
      <dc:date>2023-03-29T09:47:36Z</dc:date>
    </item>
    <item>
      <title>Re: Using "FOR XML PATH" in Spark SQL in sql syntax</title>
      <link>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7059#M3042</link>
      <description>&lt;P&gt;Posting the solution that I ended up using:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE&gt;%sql
 DROP TABLE if exists UserCountry;
  CREATE TABLE if not exists UserCountry (
   UserID INT,
   Country VARCHAR(5000)
  );
  INSERT INTO UserCountry
  SELECT 
   L.UserID AS UserID,
   -- CONCAT_WS + collect_list replaces the SQL Server COALESCE(STUFF(... FOR XML PATH(''))) expression,
   -- which Spark SQL cannot parse. Use ', ' as the separator to match the original output exactly.
   CONCAT_WS(',', collect_list(LC.Country)) AS Country
  FROM LK_ETLRunUserID L
  INNER JOIN UserStopCountry LC
  ON L.UserID = LC.UserID
  GROUP BY L.UserID&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Mar 2023 12:59:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/using-quot-for-xml-path-quot-in-spark-sql-in-sql-syntax/m-p/7059#M3042</guid>
      <dc:creator>oleole</dc:creator>
      <dc:date>2023-03-30T12:59:03Z</dc:date>
    </item>
  </channel>
</rss>

