
How to translate Apache Pig FOREACH GENERATE statement to Spark?

User15787040559
New Contributor III

If you have the following Apache Pig FOREACH GENERATE statement:

XBCUD_Y_TMP1 = FOREACH (FILTER XBCUD BY act_ind == 'Y') GENERATE cust_hash_key,CONCAT(brd_abbr_cd,ctry_cd) as brdCtry:chararray,updt_dt_hash_key;

the equivalent code in Apache Spark is:

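from pyspark.sql.functions import col, concat

# Keep only the active rows, then project the three output columns
XBCUD_Y_TMP1_DF = (XBCUD_DF
    .filter(col("act_ind") == "Y")
    .select(col("cust_hash_key"),
            concat(col("brd_abbr_cd"), col("ctry_cd")).alias("brdCtry"),
            col("updt_dt_hash_key"))
)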
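To sanity-check the translation on a handful of rows without a cluster, the semantics of the statement can be modelled in plain Python. This is only a rough sketch: the helper names `concat_or_none` and `foreach_generate` and the sample rows are made up for illustration and are not part of Pig or Spark. It assumes that both Pig's CONCAT and Spark's concat return null when any input is null, and that a null act_ind never matches 'Y'.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def concat_or_none(a: Optional[str], b: Optional[str]) -> Optional[str]:
    """Model of CONCAT/concat: the result is null if either input is null."""
    if a is None or b is None:
        return None
    return a + b


def foreach_generate(rows: List[Row]) -> List[Row]:
    """Model of FILTER ... BY act_ind == 'Y' followed by GENERATE."""
    return [
        {
            "cust_hash_key": r["cust_hash_key"],
            "brdCtry": concat_or_none(r["brd_abbr_cd"], r["ctry_cd"]),
            "updt_dt_hash_key": r["updt_dt_hash_key"],
        }
        for r in rows
        # A null (None) act_ind is not equal to "Y", so the row is dropped.
        if r["act_ind"] == "Y"
    ]


# Hypothetical sample data covering the active, inactive and null cases.
sample_rows: List[Row] = [
    {"cust_hash_key": "c1", "brd_abbr_cd": "AB", "ctry_cd": "US",
     "updt_dt_hash_key": "d1", "act_ind": "Y"},
    {"cust_hash_key": "c2", "brd_abbr_cd": "CD", "ctry_cd": "GB",
     "updt_dt_hash_key": "d2", "act_ind": "N"},
    {"cust_hash_key": "c3", "brd_abbr_cd": "EF", "ctry_cd": None,
     "updt_dt_hash_key": "d3", "act_ind": "Y"},
    {"cust_hash_key": "c4", "brd_abbr_cd": "GH", "ctry_cd": "FR",
     "updt_dt_hash_key": "d4", "act_ind": None},
]

result = foreach_generate(sample_rows)
for row in result:
    print(row)
```

Comparing this output with `XBCUD_Y_TMP1_DF.collect()` on the same rows is one way to confirm that the filter and the null handling of brdCtry behave the same in both versions.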

1 REPLY

User15725630784
New Contributor II

The equivalent code in Apache Spark is:

XBCUD_Y_TMP1_DF = (XBCUD_DF
    .filter(col("act_ind") == "Y")
    .select(col("cust_hash_key"),
            concat(col("brd_abbr_cd"), col("ctry_cd")).alias("brdCtry"),
            col("updt_dt_hash_key"))
)
