Data Engineering

Forum Posts

RiyazAli
by Valued Contributor
  • 4390 Views
  • 3 replies
  • 7 kudos

Resolved! Converting a transformation written in Spark Scala to PySpark

Hello all, I've been tasked with converting Scala Spark code to PySpark with minimal changes (more or less a literal translation). I've come across some code that claims to be a list comprehension. See the snippet below: %scala val desiredColumn = Seq("f...

Latest Reply
RiyazAli
Valued Contributor
  • 7 kudos

Another follow-up question, if you don't mind, @Pat Sienkiewicz. While trying to parse the name column into multiple columns, I came across the data below: ("James,\"A,B\", Smith", "2018", "M", 3000) In order to parse these comma-included middle na...
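
For the sample value above, the difficulty is that a comma inside the quoted part must not act as a separator. A minimal sketch in plain Python (not PySpark, and not necessarily what the thread settled on) uses the standard `csv` module, which honors double quotes. The helper name `split_name` is hypothetical.

```python
import csv


def split_name(raw):
    """Split a comma-separated name, keeping quoted commas intact."""
    # skipinitialspace drops the blank that follows each delimiter.
    return next(csv.reader([raw], skipinitialspace=True))


# The name field from the sample row in the post.
parts = split_name('James,"A,B", Smith')
```

Spark's CSV reader has a `quote` option that serves the same purpose when the data is read from a file.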

2 More Replies
suman9872
by New Contributor II
  • 1268 Views
  • 1 reply
  • 1 kudos

How to dynamically convert a Spark DataFrame to nested JSON using Spark Scala

I want to convert a DataFrame to nested JSON. Source data: the DataFrame holds values as shown in image 2. Expected output: the DataFrame values converted to nested JSON as shown in image 1. Appreciate your help!

Latest Reply
Kaniz
Community Manager
  • 1 kudos

Hi @Suman Mishra, this article explains how to convert a flattened DataFrame to a nested structure by nesting a case class within another case class. You can use this technique to build a JSON file that can then be sent to an external API.

sannycse
by New Contributor II
  • 938 Views
  • 2 replies
  • 3 kudos

Resolved! Display password as shown in example using Spark Scala

The table has the following columns: First_Name, Last_Name, Department_Id, Contact_No, Hire_Date. Display the employee First_Name, the count of characters in the first name, and the password. The password should be the first 4 letters of the first name in lower case and the date and ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

@SANJEEV BANDRU,
SELECT CONCAT(substring(First_Name, 0, 2), substring(Hire_Date, 0, 2), substring(Hire_Date, 3, 2)) as password FROM table;
If Hire_Date is a timestamp, you may need to add date_format().

1 More Reply
paourissi
by New Contributor
  • 7565 Views
  • 2 replies
  • 1 kudos

When to persist and when to unpersist RDD in Spark

Let's say I have the following:
val dataset2 = dataset1.persist(StorageLevel.MEMORY_AND_DISK)
val dataset3 = dataset2.map(.....)
1) If you do a transformation on dataset2, then you have to persist it and pass it to dataset3 and unpersist ...

Latest Reply
Arun_KumarPT
New Contributor II
  • 1 kudos

It is well documented here: http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence

1 More Reply
shampa
by New Contributor
  • 4061 Views
  • 1 reply
  • 0 kudos

How can we compare two DataFrames in Spark Scala to find the differences between the two files, including which column and which value?

I have two files and I created two DataFrames, prod1 and prod2, from them. I need to find the records, with column names and values, that do not match between the two DataFrames. id_sk is the primary key. All the columns are of string datatype. dataframe 1 (prod1) id_...

Latest Reply
manojlukhi
New Contributor II
  • 0 kudos

Use a full outer join in Spark SQL.
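
The reply is terse, so here is a rough sketch of the idea in plain Python (not the Spark API): a full outer join on the key `id_sk` keeps rows that appear in either input, after which each column can be compared side by side. The function name `diff_by_key`, the column `name`, and the sample rows are hypothetical; only `id_sk`, `prod1`, and `prod2` come from the question.

```python
def diff_by_key(left, right, key="id_sk"):
    """Emulate a full outer join on `key` and list every mismatching cell.

    `left` and `right` are lists of dicts (one dict per row). Returns a
    list of (key_value, column, left_value, right_value) tuples; a value
    is None when the row is missing on that side.
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    mismatches = []
    # The union of keys is what makes this a *full* outer join.
    for k in sorted(left_by_key.keys() | right_by_key.keys()):
        l_row = left_by_key.get(k, {})
        r_row = right_by_key.get(k, {})
        columns = sorted((l_row.keys() | r_row.keys()) - {key})
        for col in columns:
            l_val, r_val = l_row.get(col), r_row.get(col)
            if l_val != r_val:
                mismatches.append((k, col, l_val, r_val))
    return mismatches


# Hypothetical sample rows; all values are strings, as in the question.
prod1 = [{"id_sk": "1", "name": "a"}, {"id_sk": "2", "name": "b"}]
prod2 = [{"id_sk": "1", "name": "a"}, {"id_sk": "2", "name": "x"},
         {"id_sk": "3", "name": "c"}]

result = diff_by_key(prod1, prod2)
```

In Spark itself, the same idea is usually expressed by joining the two DataFrames on `id_sk` with the join type `"full_outer"` and then comparing the paired columns.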
