Re: Are window functions more performant than self...

pvignesh92 · ‎03-21-2023

Hi @Matthew Elsham, As @Lakshay Goel pointed out, I would believe Windows would work better as it will first partition them based on the partition key and then your aggregation happens within that partition on a single worker. But it's good to see your query plan for both these cases and understand with your data what works best for you.

One good blog I checked was this - https://blog.knoldus.com/using-windows-in-spark-to-avoid-joins/. Please have a look.

View solution in original post