Hi @kazinahian, to create a new "sub_total" column computed over groups defined by "category", "subcategory", and "monthly" sales value, you can use groupBy().applyInPandas() in PySpark.
It implements the "split-apply-combine" pattern: the data is first split into groups, a function is applied to each group as a pandas DataFrame, and the results are combined into a new DataFrame.
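A minimal sketch of that pattern follows. Your schema isn't shown, so the column names and types here are assumptions: I assume string columns "category" and "subcategory" and a double column "monthly_sales", and I group by the first two only. The helper names add_sub_total and with_sub_total are hypothetical; adjust the grouping columns and the schema string to match your data.

```python
import pandas as pd

# Assumed column names; the original question does not show the schema.
GROUP_COLS = ["category", "subcategory"]
VALUE_COL = "monthly_sales"


def add_sub_total(pdf: pd.DataFrame) -> pd.DataFrame:
    """Apply step: receives all rows of one group as a pandas DataFrame
    and returns them with the group's total attached to every row."""
    out = pdf.copy()
    out["sub_total"] = float(out[VALUE_COL].sum())
    return out


def with_sub_total(sdf):
    """Split/combine steps: run add_sub_total once per group of a Spark DataFrame."""
    # applyInPandas needs the full output schema up front (types are assumed here).
    schema = (
        "category string, subcategory string, "
        "monthly_sales double, sub_total double"
    )
    return (
        sdf.select(*GROUP_COLS, VALUE_COL)
        .groupBy(*GROUP_COLS)
        .applyInPandas(add_sub_total, schema=schema)
    )
```

Calling with_sub_total(df) on your Spark DataFrame should return the same rows with an extra "sub_total" column. Each group is loaded fully into memory on one executor, so this works best when no single group is very large.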