Data Engineering

Forum Posts

User15787040559
by New Contributor III
  • 968 Views
  • 2 replies
  • 0 kudos

How to translate Apache Pig FILTER statement to Spark?

If you have the following Apache Pig FILTER statement:

    XCOCD_ACT_Y = FILTER XCOCD BY act_ind == 'Y';

the equivalent code in Apache Spark is:

    XCOCD_ACT_Y_DF = (XCOCD_DF
        .filter(col("act_ind") == "Y"))

Latest Reply
FeliciaWilliam
New Contributor III
  • 0 kudos

Translating an Apache Pig FILTER statement to Spark requires understanding the differences in syntax and functionality between the two processing frameworks. While both aim to filter data, Spark uses a different syntax and approach, typically involvi...

1 More Replies
User15787040559
by New Contributor III
  • 1006 Views
  • 1 reply
  • 0 kudos

How to translate Apache Pig LOAD statement to Spark?

If you have the following Apache Pig LOAD statement:

    TOCCT = LOAD 'db_custbase.ods_corp_cust_t' using $HCatLoader;

the equivalent code in Apache Spark is:

    TOCCT_DF = spark.read.table("db_custbase.ods_corp_cust_t")

Latest Reply
Kaniz
Community Manager
  • 0 kudos

Hi @User15787040559729892342! My name is Kaniz, and I'm a technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers on the Forum have an answer to your questions first. Or else I will follow up shortly with a ...

User15787040559
by New Contributor III
  • 802 Views
  • 1 reply
  • 0 kudos

How to translate Apache Pig FOREACH GENERATE statement to Spark?

If you have the following Apache Pig FOREACH GENERATE statement:

    XBCUD_Y_TMP1 = FOREACH (FILTER XBCUD BY act_ind == 'Y') GENERATE
        cust_hash_key,
        CONCAT(brd_abbr_cd,ctry_cd) as brdCtry:chararray,
        updt_dt_hash_key;

the equivalent code in Apache Spark is: XB...

Latest Reply
User15725630784
New Contributor II
  • 0 kudos

the equivalent code in Apache Spark is:

    XBCUD_Y_TMP1_DF = (XBCUD_DF
        .filter(col("act_ind") == "Y")
        .select(col("cust_hash_key"),
                concat(col("brd_abbr_cd"), col("ctry_cd")).alias("brdCtry"),
                col("updt_dt_hash_key"))
    )
