Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

by RiyazAli, Valued Contributor
  • 5848 Views
  • 3 replies
  • 7 kudos

Resolved! Converting a transformation written in Spark Scala to PySpark

Hello all, I've been tasked with converting Scala Spark code to PySpark code with minimal changes (kind of a literal translation). I've come across some code that claims to be a list comprehension. See the snippet below: %scala val desiredColumn = Seq("f...

Latest Reply
RiyazAli
Valued Contributor
  • 7 kudos

Another follow-up question, if you don't mind, @Pat Sienkiewicz. As I was trying to parse the name column into multiple columns, I came across the data below: ("James,\"A,B\", Smith", "2018", "M", 3000). In order to parse these comma-included middle na...

2 More Replies
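
The follow-up above is about splitting a name field whose middle part contains quoted commas. A minimal Scala sketch of one way to do that with from_csv, assuming a Databricks notebook where spark is in scope (the sample row is modelled on the values quoted in the reply; the column names are illustrative):

%scala
import org.apache.spark.sql.functions.{col, from_csv}
import org.apache.spark.sql.types.{StringType, StructField, StructType}
import spark.implicits._

// Single row modelled on the example quoted in the reply above.
val df = Seq(("James,\"A,B\", Smith", "2018", "M", 3000))
  .toDF("name", "dob_year", "gender", "salary")

// Illustrative schema for the three name parts.
val nameSchema = StructType(Seq(
  StructField("first_name", StringType),
  StructField("middle_name", StringType),
  StructField("last_name", StringType)
))

// from_csv applies CSV quoting rules, so "A,B" is kept as one field
// instead of being split on its inner comma.
val parsed = df
  .withColumn("n", from_csv(col("name"), nameSchema, Map.empty[String, String]))
  .select($"n.first_name", $"n.middle_name", $"n.last_name", $"dob_year", $"gender", $"salary")

parsed.show(false)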
by sannycse, New Contributor II
  • 1503 Views
  • 2 replies
  • 3 kudos

Resolved! Display password as shown in example using Spark Scala

The table has the following columns: First_Name, Last_Name, Department_Id, Contact_No, Hire_Date. Display the employee First_Name, the count of characters in the first name, and the password. The password should be the first 4 letters of the first name in lower case and the date and ...

Latest Reply
Hubert-Dudek
Esteemed Contributor III
  • 3 kudos

@SANJEEV BANDRU, SELECT CONCAT(substring(First_Name, 0, 2), substring(Hire_Date, 0, 2), substring(Hire_Date, 3, 2)) AS password FROM table; If Hire_Date is a timestamp you may need to add date_format().

1 More Reply
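
The full password rule in the question is truncated above, so the format used here is an assumption (first four letters of First_Name in lower case plus the day and month of Hire_Date). A DataFrame-API sketch in Scala equivalent in spirit to the accepted SQL answer, assuming the data is available as a table named employees:

%scala
import org.apache.spark.sql.functions.{col, concat, date_format, length, lower, substring}

// Hypothetical table name; adjust to the real one.
val employees = spark.table("employees")

val result = employees.select(
  col("First_Name"),
  length(col("First_Name")).as("first_name_length"),
  concat(
    lower(substring(col("First_Name"), 1, 4)),   // substring positions are 1-based in Spark SQL
    date_format(col("Hire_Date"), "ddMM")        // date_format also handles timestamp columns
  ).as("password")
)

result.show(false)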
by paourissi, New Contributor
  • 8949 Views
  • 2 replies
  • 1 kudos

When to persist and when to unpersist RDD in Spark

Let's say I have the following: val dataset2 = dataset1.persist(StorageLevel.MEMORY_AND_DISK); val dataset3 = dataset2.map(.....) 1) If you do a transformation on dataset2 then you have to persist it and pass it to dataset3 and unpersist ...

Latest Reply
Arun_KumarPT
New Contributor II
  • 1 kudos

It is well documented here: http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence

1 More Reply
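
A minimal runnable sketch of the persist/unpersist pattern the question asks about, assuming a SparkSession named spark (the variable names below are illustrative stand-ins for dataset1/dataset2/dataset3):

%scala
import org.apache.spark.storage.StorageLevel
import spark.implicits._

val base   = spark.range(0, 1000000)                       // stands in for dataset1
val cached = base.persist(StorageLevel.MEMORY_AND_DISK)    // dataset2 in the question

// Caching pays off when the persisted Dataset feeds more than one action.
val doubled = cached.map(_ * 2)                            // dataset3 in the question
println(doubled.count())
println(cached.filter(_ % 2 == 0).count())

// Unpersist only after the last job that reads the cached Dataset has run;
// persisting marks `cached` for reuse, it does not hand the cache over to `doubled`.
cached.unpersist()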
by shampa, New Contributor
  • 4894 Views
  • 1 reply
  • 0 kudos

How can we compare two DataFrames in Spark Scala to find the differences between these two files: which column and which value?

I have two files and I created two DataFrames, prod1 and prod2, out of them. I need to find the records with column names and values that do not match in both DataFrames. id_sk is the primary key. All the columns are string datatype. DataFrame 1 (prod1): id_...

Latest Reply
manojlukhi
New Contributor II
  • 0 kudos

Use a full outer join in Spark SQL.
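
The reply points at a full outer join; a hedged Scala sketch of how that comparison could look (the sample rows and the non-key column names are made up for illustration, only id_sk comes from the question):

%scala
import org.apache.spark.sql.functions.{col, lit}
import spark.implicits._

// Illustrative stand-ins for the two files; id_sk is the primary key.
val prod1 = Seq(("1", "apple", "10"), ("2", "pear", "20")).toDF("id_sk", "name", "price")
val prod2 = Seq(("1", "apple", "12"), ("3", "plum", "30")).toDF("id_sk", "name", "price")

// Full outer join on the key keeps rows that exist on only one side.
val joined = prod1.as("a").join(prod2.as("b"), Seq("id_sk"), "full_outer")

// For each non-key column, keep rows where the two sides disagree (null-safe comparison).
val dataCols = prod1.columns.filterNot(_ == "id_sk")
val diffs = dataCols.map { c =>
  joined
    .filter(!(col(s"a.$c") <=> col(s"b.$c")))
    .select(
      col("id_sk"),
      lit(c).as("column_name"),
      col(s"a.$c").as("prod1_value"),
      col(s"b.$c").as("prod2_value"))
}.reduce(_ union _)

diffs.show(false)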
