Options
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
09-22-2021 02:41 PM
The issue is probably related to the self join between 100 million rows, I'm not positive without seeing the code and understanding the problem better but you may want to think about using windowing functions instead
https://blog.knoldus.com/using-windows-in-spark-to-avoid-joins/