I am using a Databricks SQL notebook to run these queries.
I have a Python UDF like this:
%python
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType, DoubleType, DateType
def get_sell_price(sale_prices):
    # Return the first element of the collected list of prices
    return sale_prices[0]
spark.udf.register("get_sell_price", get_sell_price, DoubleType())
It is called from a query like this:
SELECT
id,
get_sell_price(collect_list(sell_price))
FROM
table_name
GROUP BY
id
ORDER BY
date;
I want the sell prices inside the `collect_list` to be sorted by the specified column (`date`), so that the UDF receives them in that order. Even though I specify `ORDER BY date` in the query, the order inside the list is not maintained.
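
To make the expected result concrete, here is a minimal plain-Python sketch of the behaviour I am after. It does not use Spark at all; the sample rows and the helper name `first_price_by_date` are made up for illustration and are not from my real table.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows: (id, date, sell_price)
rows = [
    (1, date(2021, 3, 1), 30.0),
    (1, date(2021, 1, 1), 10.0),
    (2, date(2021, 2, 1), 20.0),
    (1, date(2021, 2, 1), 15.0),
]

def first_price_by_date(rows):
    """Group rows by id, order each group by date, and return the earliest price."""
    groups = defaultdict(list)
    for row_id, row_date, price in rows:
        groups[row_id].append((row_date, price))
    # Sorting the (date, price) pairs orders each group by date first
    return {row_id: sorted(pairs)[0][1] for row_id, pairs in groups.items()}

print(first_price_by_date(rows))  # {1: 10.0, 2: 20.0}
```

In other words, for each `id` the list passed to `get_sell_price` should already be ordered by `date`, so that index `0` is the price on the earliest date.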