agg function not working for multiple aggregations

FrancisLau
New Contributor

Data has 2 columns:

+-----------+---------------+
|requestDate|requestDuration|
+-----------+---------------+
| 2015-06-17|            104|
+-----------+---------------+

Here is the code:

avgSaveTimesByDate = gridSaves.groupBy(gridSaves.requestDate).agg({"requestDuration": "min", "requestDuration": "max","requestDuration": "avg"})

avgSaveTimesByDate.show(100)

Summary of Issue

I expect 4 columns of data: date, min, max, and average, but only the date and average show up. The first two aggregations do not appear. If I move max to the last position, only date and max show up. Very weird.

+-----------+--------------------+
|requestDate|AVG(requestDuration)|
+-----------+--------------------+
| 2015-06-10|   750.8886326991035|

Am I doing this incorrectly? I am trying to get a dataframe for a box plot.

1 ACCEPTED SOLUTION


ReKa
New Contributor III

My guess is that this does not work because the dictionary input does not have unique keys. With this syntax, column names are the keys, and a Python dict keeps only the last value assigned to a repeated key. If you request two or more aggregations for the same column, only the last one survives, so agg never sees the others.
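The key collision described above can be checked in plain Python, without Spark. The following is a minimal sketch that reuses the dictionary from the question; the variable names `aggs` and `reordered` are only illustrative.

```python
# A dict literal with a repeated key keeps only the last value for that key,
# so this dictionary has a single entry by the time agg() would receive it.
aggs = {"requestDuration": "min", "requestDuration": "max", "requestDuration": "avg"}
print(aggs)       # {'requestDuration': 'avg'}
print(len(aggs))  # 1

# Moving "max" to the last position leaves only "max", which matches
# the behaviour reported in the question.
reordered = {"requestDuration": "min", "requestDuration": "avg", "requestDuration": "max"}
print(reordered)  # {'requestDuration': 'max'}
```

Because the collapse happens when Python builds the dictionary, the dictionary form of agg can carry at most one aggregation per column.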


2 REPLIES

User16826991422
Contributor

Hi Francis,

Thanks for reaching out.

I just tried this in version 2.0 of Databricks and it appeared to work as expected.

Are you using version 2.0 and Spark 1.4?

If so, I would suggest using this alternate syntax:

from pyspark.sql import functions as F

aggs = df.groupBy("cut").agg(df.cut, F.min("carat"), F.max("carat"), F.avg("carat"))

Let me know if that works for you.
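Applied to the columns from the question, the alternate syntax might look like the sketch below. The function name `duration_stats_by_date` and the output column names are assumptions made for illustration, and the sketch relies on a Spark version that keeps the grouping column in the result by default, so `requestDate` is not passed to agg again.

```python
def duration_stats_by_date(grid_saves):
    """Return one row per requestDate with min, max, and avg of requestDuration.

    `grid_saves` is assumed to be a PySpark DataFrame with the columns
    requestDate and requestDuration, as in the question.
    """
    # Imported here so the function can be defined where PySpark is not installed.
    from pyspark.sql import functions as F

    return grid_saves.groupBy("requestDate").agg(
        # One column expression per aggregation, so nothing is overwritten.
        F.min("requestDuration").alias("minDuration"),
        F.max("requestDuration").alias("maxDuration"),
        F.avg("requestDuration").alias("avgDuration"),
    )
```

Calling `duration_stats_by_date(gridSaves).show(100)` should then print four columns: the date plus the three aggregates.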

