Using "FOR XML PATH" in Spark SQL in sql syntax

oleole
Contributor

I'm using Spark version 3.2.1 on Databricks (DBR 10.4 LTS), and I'm trying to convert a SQL Server query into an equivalent query that runs on a Spark cluster using Spark SQL syntax. However, Spark SQL does not seem to support FOR XML PATH, and I wonder if there is an alternative way to rewrite this SQL Server query so that Spark SQL will accept it. The original SQL Server query looks like this:

  DROP TABLE if exists UserCountry;

  CREATE TABLE if not exists UserCountry (
    UserID INT,
    Country VARCHAR(5000)
  );

  INSERT INTO UserCountry
  SELECT
    L.UserID AS UserID,
    COALESCE(
      STUFF(
        (SELECT ', ' + LC.Country FROM UserStopCountry LC WHERE L.UserID = LC.UserID FOR XML PATH ('')),
        1, 2, ''),
      '') AS Country
  FROM
    LK_ETLRunUserID L

When I run the query above in Databricks Spark SQL, I get the following error:

  ParseException: 
  mismatched input 'FOR' expecting {')', '.', '[', 'AND', 'BETWEEN', 'CLUSTER', 'DISTRIBUTE', 'DIV', 'EXCEPT', 'GROUP', 'HAVING', 'IN', 'INTERSECT', 'IS', 'LIKE', 'ILIKE', 'LIMIT', NOT, 'OR', 'ORDER', 'QUALIFY', RLIKE, 'MINUS', 'SORT', 'UNION', 'WINDOW', EQ, '<=>', '<>', '!=', '<', LTE, '>', GTE, '+', '-', '*', '/', '%', '&', '|', '||', '^', ':', '::'}(line 6, pos 80)

  == SQL ==
  INSERT INTO UserCountry
  SELECT 
   L.UserID AS UserID,
   COALESCE( 
   STUFF( 
   (SELECT ', ' + LC.Country FROM UserStopCountry LC WHERE L.UserID = LC.UserID FOR XML PATH ('')) 
  --------------------------------------------------------------------------------^^^
   , 1, 2, '')
   , '') AS Country
  FROM 
  LK_ETLRunUserID L

Given that the UserStopCountry looks like this:

[input table screenshot]

I believe the output will be:

[output table screenshot]
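(The screenshots are not preserved on this page; with made-up values, the shape of the data the poster describes would be roughly:

  UserStopCountry (input)
  UserID  Country
  1       US
  1       CA
  2       UK

  UserCountry (expected output)
  UserID  Country
  1       US, CA
  2       UK

i.e. each UserID ends up with one comma-separated string of its countries.)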

1 ACCEPTED SOLUTION


oleole
Contributor

Posting the solution that I ended up using:

%sql
DROP TABLE if exists UserCountry;

CREATE TABLE if not exists UserCountry (
  UserID INT,
  Country VARCHAR(5000)
);

INSERT INTO UserCountry
SELECT
  L.UserID AS UserID,
  CONCAT_WS(',', collect_list(LC.Country)) AS Country
FROM LK_ETLRunUserID L
INNER JOIN UserStopCountry LC
  ON L.UserID = LC.UserID
GROUP BY L.UserID


3 REPLIES

daniel_sahal
Honored Contributor III

@Jay Yang​ 

You can use a combination of array_join and collect_list. See example below.

Source table:

[screenshot]

After transformation:

[screenshot]
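(The example screenshots are missing here; a minimal sketch of the array_join + collect_list approach, reusing the table and column names from the question, would look like this:

%sql
SELECT
  LC.UserID,
  -- collect_list gathers every Country value for a UserID into an array;
  -- array_join then concatenates that array into one comma-separated string
  array_join(collect_list(LC.Country), ', ') AS Country
FROM UserStopCountry LC
GROUP BY LC.UserID

This produces one row per UserID with the countries joined by ', '.)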

Anonymous
Not applicable

Hi @Jay Yang​ 

Thank you for posting your question in our community! We are happy to assist you.

To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers your question?

This will also help other community members who may have similar questions in the future. Thank you for your participation and let us know if you need any further assistance! 

