by KosmaS • New Contributor III
- 10226 Views
- 3 replies
- 7 kudos
To cache/persist a DataFrame, an action needs to be triggered. I'm just wondering, will it make any difference if, after persisting some df, I use, for instance, take(5) instead of count()? Will it be a bit more effective, because of sending results from 5 partiti...
Latest Reply
Yes, take(5) will be more efficient in some ways. When you cache or persist a DataFrame in Spark, you are instructing Spark to store the DataFrame's intermediate data in memory (or on disk, depending on the storage level). This can significantly speed...
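One thing worth keeping in mind when comparing the two actions: Spark fills a cache partition by partition, as each partition is first computed. An action like take(5) usually needs only the first partition or two, so it is cheaper but leaves most of the cache unpopulated, while count() touches every partition. The snippet below is a toy model in plain Python, not Spark code; `LazyCachedDataset`, `make_dataset`, and `cached_partitions` are made-up names used only to illustrate that behaviour.

```python
from typing import Callable, Dict, List


class LazyCachedDataset:
    """Toy stand-in for a persisted DataFrame (hypothetical, not a Spark API).

    Each partition is computed the first time an action needs it and is
    kept afterwards, which mimics how a cache is filled lazily.
    """

    def __init__(self, partition_loaders: List[Callable[[], List[int]]]):
        self._loaders = partition_loaders
        self._cache: Dict[int, List[int]] = {}

    def _partition(self, index: int) -> List[int]:
        # Compute the partition only on first access, then reuse it.
        if index not in self._cache:
            self._cache[index] = self._loaders[index]()
        return self._cache[index]

    def take(self, n: int) -> List[int]:
        # Stop scanning as soon as enough rows have been collected.
        rows: List[int] = []
        for i in range(len(self._loaders)):
            if len(rows) >= n:
                break
            rows.extend(self._partition(i))
        return rows[:n]

    def count(self) -> int:
        # Counting has to visit every partition, so all of them get cached.
        return sum(len(self._partition(i)) for i in range(len(self._loaders)))

    def cached_partitions(self) -> List[int]:
        return sorted(self._cache)


def make_dataset(num_partitions: int = 4, rows_per_partition: int = 10) -> LazyCachedDataset:
    """Build a dataset whose partition p holds rows_per_partition consecutive ints."""

    def loader(p: int) -> Callable[[], List[int]]:
        start = p * rows_per_partition
        return lambda: list(range(start, start + rows_per_partition))

    return LazyCachedDataset([loader(p) for p in range(num_partitions)])


if __name__ == "__main__":
    ds = make_dataset()
    ds.take(5)
    print("cached after take(5):", ds.cached_partitions())
    ds.count()
    print("cached after count():", ds.cached_partitions())
```

In this model, take(5) leaves only partition 0 cached, and the remaining partitions are computed later, when some other action first needs them. If the goal is to have the whole DataFrame cached up front, an action that scans everything, such as count(), is the usual choice.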
2 More Replies
- 1904 Views
- 2 replies
- 0 kudos
Hi all, I've used mounts based on service principals, but users on shared clusters or the new serverless compute have problems with permissions to access resources on DBFS. Right now we have used clusters in single mode. What should be the best approach to...