Pyspark serialization
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-21-2024 04:50 AM
Hi,
I was looking for comprehensive documentation on implementing serialization in pyspark, most of the places I have seen is all about serialization with scala. Could you point out where I can get a detailed explanation on it?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-26-2024 12:37 AM
This is awesome. Thank you for replying.
I want to ask one more thing before we close this, in Scala-spark java serialization is default and additionally we have kryo serialization as well which is better. So these are not applicable in pyspark if i get correctly. Kindly confirm.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-01-2024 06:06 AM
This is great to know!
Thank you for the explanation.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
06-26-2024 02:50 AM
This is awesome. Thank you for replying.
I want to ask one more thing before we close this, in Scala-spark java serialization is default and additionally we have kryo serialization as well which is better. So, can we use them in pyspark as well?
Another important thing, the code below creates a sparkcontext on local, that doesnt work on databricks. When I try to change the sparkcontext arguments, i get an error , attached screenshot, how can we resolve this, ultimately i dont want to run spark locally, but on databricks. Would appreciate if you answer this.
Thanks for the support.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-01-2024 04:27 AM
@Retired_mod Could you clarify on my query? Eagerly awaiting response.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
07-01-2024 06:05 AM
Thank you @Retired_mod for the prompt reply. This clears the things and also distinguishes between spark-scala and pyspark. Appreciate your explanation. Will apply this and also share any findings based on this which will help the community!