Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Dashboard use case - order of bars

Henrik_
New Contributor III

On a Spark DataFrame, is there a smart way to set the order of a categorical feature explicitly, equivalent to an ordered Categorical in pandas? The use case here is a dashboard in Databricks, and I want the bars to be arranged in a certain order.

4 REPLIES

holly
Valued Contributor III

Hi there, you can use a map function. Create a map with the creatively named create_map, and then sort by the values in the map.

The code will look something like this (I haven't tested it, so treat it as pseudocode):

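from pyspark.sql.functions import create_map, lit, col

# Desired display order, first to last
categories = ['small', 'medium', 'large', 'xlarge']

# Literal map column: category name -> sort position.
# Keys are wrapped in lit() so they are treated as values, not column names.
order_map = create_map([val for (i, category) in enumerate(categories) for val in (lit(category), lit(i))])

# gives roughly Column<'map(small, 0, medium, 1, large, 2, xlarge, 3)'>

# Sort rows by the position looked up for each row's category_col value
display(df.orderBy(order_map[col('category_col')]))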
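The same idea can also be written as a SQL CASE expression, which may be handy if the dashboard is driven by a SQL query rather than a DataFrame. This is only a sketch: the helper name `case_order_expr` is made up for illustration, it assumes category names contain no quotes or backslashes, and it sends unlisted values to the end.

```python
def case_order_expr(column_name, categories):
    """Build a SQL CASE expression mapping each category to its position.

    Hypothetical helper, not part of PySpark. Values not listed in
    `categories` get the position len(categories), so they sort last.
    """
    if "`" in column_name:
        raise ValueError("column name must not contain a backtick")
    for category in categories:
        # Keep the sketch simple: refuse values that would need escaping.
        if "'" in category or "\\" in category:
            raise ValueError(f"unsupported character in category: {category!r}")

    branches = " ".join(
        f"WHEN '{category}' THEN {position}"
        for position, category in enumerate(categories)
    )
    return f"CASE `{column_name}` {branches} ELSE {len(categories)} END"


print(case_order_expr("category_col", ["small", "medium", "large", "xlarge"]))
```

The resulting string could go into an ORDER BY clause of a SQL query, or be passed to `expr` from `pyspark.sql.functions`, as in `df.orderBy(expr(case_order_expr('category_col', categories)))`.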

 

Henrik_
New Contributor III

Thanks! One question: this code will order the whole DataFrame based on the logic from create_map. However, I want to put several figures, each with its own sorting logic, on display in a dashboard. I don't think this method will work for that use case?

holly
Valued Contributor III

Ah, I think I see. Let's say your dataset has category_col1 with values {S, M, L, XL} and category_col2 with values {XS, S, M}, and you want to sort the data by category_col1 and category_col2.

If you want to specify the order for the user, you can repeat the create_map step to make map_1 and map_2, and then order by the two columns. You can do this as part of your pipeline and save the results to your table, so the ordering is available beyond the DataFrame itself.
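A rough sketch of that, untested against a real cluster: add one integer sort-key column per categorical column, so each figure can sort on its own key. The helper names, the `_order` suffix, and the table name are assumptions made for illustration.

```python
from itertools import chain


def order_pairs(categories):
    """Flatten ['S', 'M'] into ['S', 0, 'M', 1], the key/value layout create_map takes."""
    return list(chain.from_iterable((c, i) for i, c in enumerate(categories)))


def with_sort_keys(df, orders):
    """Add a '<column>_order' integer column for every entry in `orders`.

    `orders` maps a column name to its categories in display order.
    Values missing from the list get a null sort key.
    """
    # Imported lazily so order_pairs can be used without Spark installed.
    from pyspark.sql.functions import col, create_map, lit

    for column_name, categories in orders.items():
        order_map = create_map(*[lit(v) for v in order_pairs(categories)])
        df = df.withColumn(f"{column_name}_order", order_map[col(column_name)])
    return df


# Hypothetical usage, assuming `df` has category_col1 and category_col2:
# keyed = with_sort_keys(df, {
#     "category_col1": ["S", "M", "L", "XL"],
#     "category_col2": ["XS", "S", "M"],
# })
# display(keyed.orderBy("category_col1_order", "category_col2_order"))
# keyed.write.mode("overwrite").saveAsTable("my_schema.sizes_with_order")  # made-up table name
```

Once the key columns are saved with the table, each visualisation can be sorted on its own `_order` column, provided the chart type lets you sort by a field.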

BUT

If you want the end user to be able to sort the final Databricks visualisation / table by clicking values, that's something we don't have at the moment. I think it's a sensible ask, so I'll raise this with our BI team.

Henrik_
New Contributor III

Thanks for your effort!
