spark.apache.org
10-22-2021 09:06 AM
mapInPandas is one of the most powerful Spark functions. It uses Apache Arrow, an in-memory columnar format, to split a Spark DataFrame into batches and feed them to a Python function that takes an iterator of pandas DataFrames as input and yields pandas DataFrames as output. Check it out here:
https://spark.apache.org/docs/3.0.0/sql-pyspark-pandas-with-arrow.html#map
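Here is a minimal sketch of how this can look in practice. The names `filter_adults` and `run_with_spark`, the `id`/`age` columns, and the sample rows are made up for illustration, and the local `SparkSession` setup is an assumption; on a cluster you would use the session you already have.

```python
from typing import Iterator

import pandas as pd


def filter_adults(batches: Iterator[pd.DataFrame]) -> Iterator[pd.DataFrame]:
    """Receive pandas DataFrames one batch at a time and yield filtered batches."""
    for pdf in batches:
        # Keep only the rows where age is 18 or more (hypothetical rule).
        yield pdf[pdf["age"] >= 18]


def run_with_spark() -> list:
    # Imported lazily so the pandas function above can be tried without Spark.
    from pyspark.sql import SparkSession

    # Assumption: a local single-core session, just for the demo.
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    df = spark.createDataFrame([(1, 21), (2, 15), (3, 40)], ("id", "age"))

    # The schema argument describes the DataFrames that filter_adults yields.
    # Here the output columns match the input, so df.schema is reused.
    result = df.mapInPandas(filter_adults, schema=df.schema)
    return result.collect()
```

Since the function only sees an iterator of pandas DataFrames, you can check it on plain pandas data first, for example `list(filter_adults(iter([some_pdf])))`, before handing it to Spark.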
Labels: Pandas, Pandas udf

