PySpark DataFrame column comparison

UmaMahesh1 · ‎12-01-2022

I have a string column whose values are elements concatenated with hyphens. For example, three values from that column look like this:

Row 1 - A-B-C-D-E-F

Row 2 - A-B-G-C-D-E-F

Row 3 - A-B-G-D-E-F

I want to compare each row with the one before it and create a column describing what has changed. Specifically, 4 comparisons

So my output should look like this:

Row 1 : null

Row 2 : G added

Row 3 : C removed

Any ideas or suggestions?

Uma Mahesh D
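
One possible approach, offered as a minimal sketch and not a definitive answer: pull the previous row's value next to the current one with a window `lag`, then diff the two hyphen-separated lists. The post does not say how the rows are ordered, so the sketch assumes a column that defines row order exists. The names `describe_change`, `add_change_column`, `order_col`, and the output column `change` are hypothetical, as is the "no change" label for identical rows. The comparison is by membership only, so reordered or duplicated elements are not reported.

```python
def describe_change(prev, curr, sep="-"):
    """Describe which elements were added to or removed from prev to get curr.

    Returns None when either value is missing (e.g. the first row, which has
    no previous row). Comparison is by membership only: reordering and
    duplicate elements are ignored.
    """
    if prev is None or curr is None:
        return None
    prev_items = prev.split(sep)
    curr_items = curr.split(sep)
    added = [x for x in curr_items if x not in prev_items]
    removed = [x for x in prev_items if x not in curr_items]
    parts = []
    if added:
        parts.append(",".join(added) + " added")
    if removed:
        parts.append(",".join(removed) + " removed")
    # "no change" is an assumed label; the post does not cover identical rows.
    return "; ".join(parts) if parts else "no change"


def add_change_column(df, value_col, order_col):
    """Add a 'change' column comparing each row with the previous one.

    Assumes order_col (hypothetical) defines the row order.
    """
    # Imported here so describe_change can be used and tested without Spark.
    from pyspark.sql import Window, functions as F
    from pyspark.sql.types import StringType

    change_udf = F.udf(describe_change, StringType())
    # No partitionBy: Spark moves all rows to a single partition for this
    # window, which is fine for small data but slow for large data.
    w = Window.orderBy(order_col)
    prev_value = F.lag(value_col).over(w)
    return df.withColumn("change", change_udf(prev_value, F.col(value_col)))


# Usage, assuming a DataFrame df with hypothetical columns "id" and "path":
#   result = add_change_column(df, "path", "id")
#   result.show(truncate=False)
```

If the data has a natural grouping key, adding `partitionBy` to the window would avoid the single-partition shuffle. On Spark 2.4 or later, the built-in `split` and `array_except` functions could replace the Python UDF.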