cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

I am looking to use the pivot function with Spark SQL (not Python)

nickg
New Contributor III

Hello. I am trying to using the Pivot function for email addresses. This is what I have so far:

Select fname, lname, awUniqueID, Email1, Email2

From xxxxxxxx

Pivot (

    count(Email) as Test

    For Email

    In (1 as Email1, 2 as Email2)

    )

I get everything I need except Email1 and Email2 have null values instead of actual email addresses. I just need a little push to get me over the edge. Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions

Hubert-Dudek
Esteemed Contributor III

Please add example source data and desired output.

  • "For Email" should be a list of columns to replace. In the source, we have Email1, Email2, so instead, it can be FOR (email1, email2)
  • count the same probably needed is count(awUniqueID)
  • Is 1 and 2 values in Email1 and Email2? If not, another solution is necessary, but it is hard to help without an example

View solution in original post

6 REPLIES 6

Hubert-Dudek
Esteemed Contributor III

Please add example source data and desired output.

  • "For Email" should be a list of columns to replace. In the source, we have Email1, Email2, so instead, it can be FOR (email1, email2)
  • count the same probably needed is count(awUniqueID)
  • Is 1 and 2 values in Email1 and Email2? If not, another solution is necessary, but it is hard to help without an example

nickg
New Contributor III

source data:

fname lname awUniqueID Email

John Smith 22 jsmith@gmail.com

JODI JONES 22 jsmith@live.com

Desired output:

fname lname awUniqueID Email1 Email2

John Smith 22 jsmith@gmail.com jsmith@live.com

nickg
New Contributor III

Oops. The second name in the source data should be John Smith as well.

Hubert-Dudek
Esteemed Contributor III

Just create a copy of dataframe (or temporary view) rename the Email column to Email2 and than join on with source awUniqueID.

nickg
New Contributor III

Thanks Hubert. I did that and it worked. I still want to get 'Pivot' to work as well.

nickg
New Contributor III

Hi Kaniz,

Thanks for your message. I was able to make it work with the workaround that Hubert provided. I would eventually like to make it work with the 'Pivot' command. I have not revisited it and tested the 'Pivot' command as I was on vacation for a couple of weeks.

Connect with Databricks Users in Your Area

Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you wonโ€™t want to miss the chance to attend and share knowledge.

If there isnโ€™t a group near you, start one and help create a community that brings people together.

Request a New Group