How to translate Apache Pig FOREACH GENERATE statement to Spark?
06-18-2021 10:01 AM
If you have the following Apache Pig FOREACH GENERATE statement:
    XBCUD_Y_TMP1 = FOREACH (FILTER XBCUD BY act_ind == 'Y')
                   GENERATE cust_hash_key,
                            CONCAT(brd_abbr_cd, ctry_cd) AS brdCtry:chararray,
                            updt_dt_hash_key;

the equivalent code in Apache Spark is:
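    from pyspark.sql.functions import col, concat

    # XBCUD_DF is the DataFrame that corresponds to the Pig relation XBCUD.
    XBCUD_Y_TMP1_DF = (
        XBCUD_DF
        .filter(col("act_ind") == "Y")  # FILTER XBCUD BY act_ind == 'Y'
        .select(                        # FOREACH ... GENERATE
            col("cust_hash_key"),
            concat(col("brd_abbr_cd"), col("ctry_cd")).alias("brdCtry"),
            col("updt_dt_hash_key"),
        )
    )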
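
To make the semantics concrete, here is a minimal plain-Python sketch of what both the Pig statement and the Spark code compute: keep the rows where act_ind is 'Y', then project three columns, one of them a concatenation. It is only a model and does not use Spark. The helper names (concat_or_none, xbcud_y_tmp1) and the sample rows are hypothetical; only the column names come from the post. It also models one detail that is easy to miss: Pig's CONCAT and Spark's concat both return null when any input is null.

```python
# Plain-Python model of the Pig/Spark statement; no Spark required.
# Sample rows and helper names are hypothetical; column names come from the post.

def concat_or_none(*parts):
    """Mimic CONCAT/concat null handling: any null input yields null."""
    if any(p is None for p in parts):
        return None
    return "".join(parts)

def xbcud_y_tmp1(rows):
    """Keep rows with act_ind == 'Y', then project the three output columns."""
    return [
        {
            "cust_hash_key": r["cust_hash_key"],
            "brdCtry": concat_or_none(r["brd_abbr_cd"], r["ctry_cd"]),
            "updt_dt_hash_key": r["updt_dt_hash_key"],
        }
        for r in rows
        if r["act_ind"] == "Y"
    ]

sample_rows = [
    {"cust_hash_key": "c1", "brd_abbr_cd": "AB", "ctry_cd": "US",
     "updt_dt_hash_key": "u1", "act_ind": "Y"},
    {"cust_hash_key": "c2", "brd_abbr_cd": "CD", "ctry_cd": "GB",
     "updt_dt_hash_key": "u2", "act_ind": "N"},   # dropped by the filter
    {"cust_hash_key": "c3", "brd_abbr_cd": "EF", "ctry_cd": None,
     "updt_dt_hash_key": "u3", "act_ind": "Y"},   # brdCtry becomes None
]

result = xbcud_y_tmp1(sample_rows)
for row in result:
    print(row)
```

If a null brand or country code should not null out brdCtry, the inputs would need to be defaulted first (for example with coalesce in Spark); that is a design choice the original statement does not make.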
Labels: Apache Spark, Apache Pig, PySpark
06-18-2021 12:28 PM