- 6875 Views
- 3 replies
- 7 kudos
Hello all, I've been tasked to convert Scala Spark code to PySpark code with minimal changes (a fairly literal translation). I've come across some code that claims to be a list comprehension. See the code snippet below:
%scala
val desiredColumn = Seq("f...
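Since the snippet is cut off, here is only a hedged sketch of the usual translation: a Scala for/yield comprehension over a Seq of column names maps to a plain Python list comprehension in PySpark. The column names below are hypothetical stand-ins for the truncated Seq.

from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

# Hypothetical stand-in for the truncated Scala code:
#   val desiredColumn = Seq("first_name", "last_name")
#   val selected = for (c <- desiredColumn) yield col(c)
desired_column = ["first_name", "last_name"]

df = spark.createDataFrame(
    [("James", "Smith", 3000)],
    ["first_name", "last_name", "salary"],
)

# The Scala for/yield comprehension becomes a Python list comprehension
selected = [col(c) for c in desired_column]
df.select(*selected).show()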
Latest Reply
Another follow-up question, if you don't mind, @Pat Sienkiewicz. As I was trying to parse the name column into multiple columns, I came across the data below:
("James,\"A,B\", Smith", "2018", "M", 3000)
In order to parse these comma-included middle na...
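The reply above is truncated, so as a hedged sketch (not necessarily what the thread suggested): from_csv parses a string column with full CSV quoting rules, so the quoted "A,B" middle name survives as one field instead of being split on its comma. The schema and column names are assumptions.

from pyspark.sql import SparkSession
from pyspark.sql.functions import from_csv, col

spark = SparkSession.builder.getOrCreate()

df = spark.createDataFrame(
    [('James,"A,B", Smith', "2018", "M", 3000)],
    ["name", "year", "gender", "salary"],
)

# Parse the name column with CSV semantics so the quoted middle
# name "A,B" stays intact rather than being split on its comma
parsed = df.withColumn(
    "name_parts",
    from_csv(col("name"), "first STRING, middle STRING, last STRING"),
)

parsed.select("name_parts.first", "name_parts.middle", "name_parts.last").show()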
2 More Replies
- 1872 Views
- 2 replies
- 3 kudos
The table has the following columns: First_Name, Last_Name, Department_Id, Contact_No, Hire_Date. Display the employee First_Name, the count of characters in the first name, and a password. The password should be the first 4 letters of the first name in lower case and the date and ...
Latest Reply
@SANJEEV BANDRU ,
SELECT
  CONCAT(lower(substring(First_Name, 1, 4)), substring(Hire_Date, 1, 2), substring(Hire_Date, 4, 2)) AS password
FROM
  table;
Note that substring() in Spark SQL is 1-indexed, and the question asks for the first 4 letters of the first name in lower case. If Hire_Date is a timestamp you may need to add date_format().
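For the PySpark side of the same task, a hedged equivalent of the query above; the sample row and the 'dd-MM-yyyy' Hire_Date format are assumptions.

from pyspark.sql import SparkSession
from pyspark.sql.functions import concat, lower, substring, length, col

spark = SparkSession.builder.getOrCreate()

# Hypothetical sample row; Hire_Date assumed to be a 'dd-MM-yyyy' string
df = spark.createDataFrame(
    [("Sanjeev", "Bandru", 10, "1234567890", "15-01-2020")],
    ["First_Name", "Last_Name", "Department_Id", "Contact_No", "Hire_Date"],
)

# substring() is 1-indexed: first 4 letters of the first name in
# lower case, then the day and month digits of the hire date
df.select(
    col("First_Name"),
    length("First_Name").alias("name_length"),
    concat(
        lower(substring("First_Name", 1, 4)),
        substring("Hire_Date", 1, 2),
        substring("Hire_Date", 4, 2),
    ).alias("password"),
).show()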
1 More Replies
- 2063 Views
- 0 replies
- 1 kudos
I want to convert a DataFrame to nested JSON. Source data: the DataFrame values are as shown in image 2 (attached). Expected output: I have to convert the DataFrame values to nested JSON as shown in image 1 (attached). Appreciate your help!
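Since the source and target shapes were only shared as images, this is a minimal sketch under assumed column names: group flat columns into a struct and serialize each row to JSON.

from pyspark.sql import SparkSession
from pyspark.sql.functions import struct, col

spark = SparkSession.builder.getOrCreate()

# Assumed flat source columns; the real schema was only shown as an image
df = spark.createDataFrame(
    [("James", "Smith", "NY", "10001")],
    ["first_name", "last_name", "city", "zip"],
)

# Nest the address fields into a struct, then render each row as JSON
nested = df.select(
    col("first_name"),
    col("last_name"),
    struct(col("city"), col("zip")).alias("address"),
)

nested.toJSON().collect()
# ['{"first_name":"James","last_name":"Smith","address":{"city":"NY","zip":"10001"}}']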
- 9925 Views
- 2 replies
- 1 kudos
Let's say I have the following:
val dataset2 = dataset1.persist(StorageLevel.MEMORY_AND_DISK)
val dataset3 = dataset2.map(.....)
1) If you do a transformation on dataset2 then you have to persist it and pass it to dataset3 and unpersist ...
Latest Reply
It is well documented here: http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence
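As a hedged PySpark rendering of the pattern in the question (the original is Scala): persist before reusing a dataset across actions, and unpersist once downstream results no longer need it.

from pyspark import StorageLevel
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

dataset1 = spark.range(1_000_000)

# Cache dataset1 so the transformation below is not recomputed per action
dataset2 = dataset1.persist(StorageLevel.MEMORY_AND_DISK)

dataset3 = dataset2.selectExpr("id * 2 AS doubled")

dataset3.count()  # first action materializes and caches dataset2
dataset3.show(5)  # later actions reuse the cached data

# Release the cache once dataset2 is no longer needed
dataset2.unpersist()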
1 More Replies
- 5639 Views
- 1 replies
- 0 kudos
I have two files and I created two dataframes, prod1 and prod2, out of them. I need to find the records with column names and values that are not matching in both the dfs.
id_sk is the primary key. All the cols are string datatype.
dataframe 1 (prod1)
id_...
Latest Reply
Use a full outer join in Spark SQL.
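A hedged sketch of that suggestion in PySpark; the rows and non-key columns are hypothetical, only id_sk and the all-string schema come from the question.

from functools import reduce
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()

prod1 = spark.createDataFrame(
    [("1", "apple", "red"), ("2", "pear", "green")],
    ["id_sk", "name", "color"],
)
prod2 = spark.createDataFrame(
    [("1", "apple", "yellow"), ("3", "plum", "purple")],
    ["id_sk", "name", "color"],
)

# Full outer join on the primary key keeps rows present on either side
joined = prod1.alias("a").join(prod2.alias("b"), on="id_sk", how="full_outer")

# A row mismatches if any non-key column differs (null-safe comparison)
compare_cols = [c for c in prod1.columns if c != "id_sk"]
any_diff = reduce(
    lambda acc, d: acc | d,
    [~col(f"a.{c}").eqNullSafe(col(f"b.{c}")) for c in compare_cols],
)

joined.filter(any_diff).select(
    "id_sk",
    *[col(f"a.{c}").alias(f"prod1_{c}") for c in compare_cols],
    *[col(f"b.{c}").alias(f"prod2_{c}") for c in compare_cols],
).show()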