How to dynamically perform aggregation on all columns in a DataFrame, even when the columns have different types (int, double, string, datetime, float), in PySpark? (I have 140-200 columns and need to perform an aggregation/avg on each column.)
03-01-2023 07:02 AM
I need to aggregate all the numerical columns, but I need to do this dynamically.
- Labels: Columns, Different Types
03-05-2023 08:57 PM
Hi, have you tried the aggregate function? It may help in this case:
https://docs.databricks.com/sql/language-manual/functions/aggregate.html
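For the DataFrame side of this, a common pattern is to inspect the schema, keep only the numeric columns, and build the avg() expressions dynamically. Below is a minimal sketch, assuming a SparkSession is available; the small `df` with columns `id`, `score`, and `label` is a made-up stand-in for your real 140-200 column DataFrame:

```python
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import NumericType

spark = SparkSession.builder.getOrCreate()

# Hypothetical example DataFrame; replace with your own wide DataFrame.
df = spark.createDataFrame(
    [(1, 2.0, "a"), (3, 4.5, "b")],
    ["id", "score", "label"],
)

# Keep only columns whose type subclasses NumericType
# (int, bigint, double, float, decimal, ...), skipping strings and timestamps.
numeric_cols = [
    f.name for f in df.schema.fields if isinstance(f.dataType, NumericType)
]

# Build one avg() expression per numeric column and run a single aggregation.
agg_exprs = [F.avg(F.col(c)).alias(f"avg_{c}") for c in numeric_cols]
result = df.agg(*agg_exprs)
result.show()
```

Because every avg() expression goes into a single .agg() call, Spark computes all the averages in one pass over the data rather than launching one job per column, which matters with 140-200 columns.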
03-05-2023 11:16 PM
Hi @sandeep tummala,
Thank you for your question! Please take a moment to review the answer above and let us know whether it fits your needs. If it does, help us select the best solution by clicking "Select As Best".
Your feedback helps us ensure that we are providing the best possible service to you.
Thank you!

