Pholo
Contributor

Hi @Shivers Robert,

Try something like this:

import pyspark.sql.functions as F
 
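def year_sum(year, column_year, column_sum):
  # Keep the value for rows of the given year; other rows become NULL, which F.sum ignores
  return F.when(
    F.col(column_year) == year, F.col(column_sum)
  ).otherwise(F.lit(None))

# One output column per year, each holding that year's sum
display(df.select(*[F.sum(year_sum(i, 'year', 'your_column_variable')).alias(str(i)) for i in [2018, 2019]]))

# Or use the pivot method: group on a constant so the whole DataFrame collapses to a single row
display(df.groupby(F.lit('fake')).pivot('year').agg(F.sum('your_column_variable')).drop('fake'))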
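
If it helps to see what the two snippets should return, here is a rough plain-Python sketch of the same logic. It does not use Spark at all; the sample rows and the helper are made up for illustration, and None stands in for a NULL value.

```python
# Made-up sample rows: (year, value). None stands in for a SQL NULL.
rows = [(2018, 10), (2018, 5), (2019, 7), (2020, 3), (2019, None)]

def year_sum(rows, year):
    """Sum the values of one year, skipping None; return None if nothing matched."""
    matched = [v for y, v in rows if y == year and v is not None]
    return sum(matched) if matched else None

# Conditional-sum approach: one entry per year you ask for.
selected = {str(y): year_sum(rows, y) for y in [2018, 2019]}

# Pivot approach: one entry per distinct year found in the data.
years = sorted({y for y, _ in rows})
pivoted = {str(y): year_sum(rows, y) for y in years}

print(selected)
print(pivoted)
```

The difference to keep in mind: the first approach only produces the years you list, while the pivot produces a column for every year present in the data.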

Let me know if it works.