03-13-2023 02:52 AM
No solution yet:
Hi @Suteja Kanuri ,
Thank you for thinking along and replying!
Unfortunately, I have not found a solution yet.
- I am getting an error saying that no ```.getCache()``` method exists on the Spark context. I also tried something similar with ```sql_context.clearCache()```, which did not work properly either.
- This is not the case. All data is persisted in memory (according to the Spark UI).
- This might be the problem. The dataframe is used throughout my application to calculate other dataframes. Since the persisted dataframe is used across the application and in different scopes, it is very cumbersome to unpersist all dataframes that refer to the original persisted dataframe. That is why I am trying to clear the complete cache of the cluster.
Besides these points, I am also wondering why the cache shows up in my Spark UI and is used when applying calculations on the data, while I cannot get the persistent RDDs when I use the sql_context (last code block in the original post).
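For reference, here is a minimal sketch of the "clear everything" behaviour I am after, assuming `spark` is an active SparkSession. `spark.catalog.clearCache()` is the public call for cached tables and dataframes. The `_jsc.getPersistentRDDs()` part goes through an internal handle to the JVM Spark context and is not part of the public PySpark API, so treat it as an assumption that may not hold on every runtime. The helper name `clear_all_cached_data` is made up for illustration.

```python
def clear_all_cached_data(spark):
    """Drop cached tables/dataframes and unpersist all persisted RDDs.

    `spark` is assumed to be an active SparkSession.
    Returns the ids of the RDDs that were still registered as persistent.
    """
    # Public API: removes all cached tables and dataframes from the cache.
    spark.catalog.clearCache()

    # Internal API (assumption): `_jsc` is the underlying JavaSparkContext.
    # getPersistentRDDs() returns a map of RDD id -> JVM RDD object.
    persistent = spark.sparkContext._jsc.getPersistentRDDs()

    unpersisted_ids = []
    # Copy the entries first so we never iterate a map while changing it.
    for rdd_id, java_rdd in list(persistent.items()):
        java_rdd.unpersist()
        unpersisted_ids.append(rdd_id)
    return unpersisted_ids
```

I would call this as `clear_all_cached_data(spark)` at the point where the original persisted dataframe is no longer needed, and then check the Storage tab in the Spark UI to see whether anything is still listed.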
Are there any other ideas I could try?
Kind regards,
Maarten