<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Dashboard use case - order of bars in Data Engineering</title>
    <link>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/87205#M37414</link>
    <description>&lt;P&gt;On a Spark DataFrame, is there any smart way to set the order of a categorical feature explicitly, equivalent to Categorical(categories=list, ordered=True) in pandas? The use case here is a dashboard in Databricks, and I want the bars to be arranged in a certain order.&lt;/P&gt;</description>
    <pubDate>Mon, 02 Sep 2024 11:56:27 GMT</pubDate>
    <dc:creator>Henrik_</dc:creator>
    <dc:date>2024-09-02T11:56:27Z</dc:date>
    <item>
      <title>Dashboard use case - order of bars</title>
      <link>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/87205#M37414</link>
      <description>&lt;P&gt;On a Spark DataFrame, is there any smart way to set the order of a categorical feature explicitly, equivalent to Categorical(categories=list, ordered=True) in pandas? The use case here is a dashboard in Databricks, and I want the bars to be arranged in a certain order.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Sep 2024 11:56:27 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/87205#M37414</guid>
      <dc:creator>Henrik_</dc:creator>
      <dc:date>2024-09-02T11:56:27Z</dc:date>
    </item>
    <item>
      <title>Re: Dashboard use case - order of bars</title>
      <link>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88042#M37472</link>
      <description>&lt;P&gt;Hi there, you can use a map function. Create a map with the creatively named &lt;A href="https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.create_map.html" target="_self"&gt;create_map&lt;/A&gt;, and then sort by the values in the map.&lt;/P&gt;
&lt;P&gt;The code will look something like this (I haven't tested it, so take it as pseudocode):&lt;/P&gt;
&lt;PRE class="lang-py s-code-block"&gt;&lt;CODE class="hljs language-python" data-highlighted="yes"&gt;&lt;SPAN class="hljs-keyword"&gt;from&lt;/SPAN&gt; pyspark.sql.functions &lt;SPAN class="hljs-keyword"&gt;import&lt;/SPAN&gt; create_map, lit, col

&lt;SPAN class="hljs-comment"&gt;# Desired bar order, first to last&lt;/SPAN&gt;
categories = [&lt;SPAN class="hljs-string"&gt;'small'&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;'medium'&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;'large'&lt;/SPAN&gt;, &lt;SPAN class="hljs-string"&gt;'xlarge'&lt;/SPAN&gt;]

&lt;SPAN class="hljs-comment"&gt;# Map each label to its position; lit() makes the label a literal instead of a column name&lt;/SPAN&gt;
order_map = create_map([val &lt;SPAN class="hljs-keyword"&gt;for&lt;/SPAN&gt; (i, category) &lt;SPAN class="hljs-keyword"&gt;in&lt;/SPAN&gt; &lt;SPAN class="hljs-built_in"&gt;enumerate&lt;/SPAN&gt;(categories) &lt;SPAN class="hljs-keyword"&gt;for&lt;/SPAN&gt; val &lt;SPAN class="hljs-keyword"&gt;in&lt;/SPAN&gt; (lit(category), lit(i))])

&lt;SPAN class="hljs-comment"&gt;# order_map is the map column map(small, 0, medium, 1, large, 2, xlarge, 3)&lt;/SPAN&gt;

&lt;SPAN class="hljs-comment"&gt;# Sort rows by the position of their category_col value&lt;/SPAN&gt;
display(df.orderBy(order_map[col(&lt;SPAN class="hljs-string"&gt;'category_col'&lt;/SPAN&gt;)]))&lt;/CODE&gt;&lt;/PRE&gt;
</description>
      <pubDate>Wed, 04 Sep 2024 14:15:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88042#M37472</guid>
      <dc:creator>holly</dc:creator>
      <dc:date>2024-09-04T14:15:34Z</dc:date>
    </item>
    <item>
      <title>Re: Dashboard use case - order of bars</title>
      <link>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88054#M37475</link>
      <description>&lt;P&gt;Thanks! One question: this code will order the whole dataframe based on the logic from create_map. However, I want to display several figures in a dashboard, each with its own sorting logic. I don't think this method will work for that use case?&lt;/P&gt;</description>
      <pubDate>Tue, 03 Sep 2024 13:32:28 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88054#M37475</guid>
      <dc:creator>Henrik_</dc:creator>
      <dc:date>2024-09-03T13:32:28Z</dc:date>
    </item>
    <item>
      <title>Re: Dashboard use case - order of bars</title>
      <link>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88509#M37544</link>
      <description>&lt;P&gt;Ah, I think I see. Let's say your dataset has &lt;FONT face="andale mono,times"&gt;category_col1&lt;/FONT&gt; with {S, M, L, XL} values, then &lt;FONT face="andale mono,times"&gt;category_col2&lt;/FONT&gt; with {XS, S, M}, and you want to sort the data by &lt;FONT face="andale mono,times"&gt;category_col1&lt;/FONT&gt; and &lt;FONT face="andale mono,times"&gt;category_col2&lt;/FONT&gt;.&lt;/P&gt;
&lt;P&gt;If you want to &lt;STRONG&gt;specify the order&lt;/STRONG&gt; for the user, you can duplicate the &lt;FONT face="andale mono,times"&gt;create_map&lt;/FONT&gt; step to make &lt;FONT face="andale mono,times"&gt;map_1&lt;/FONT&gt; and &lt;FONT face="andale mono,times"&gt;map_2&lt;/FONT&gt;, and then order by both columns. You can do this as part of your pipeline and save the results to your table, so the ordering isn't only available in the dataframe.&lt;/P&gt;
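&lt;P&gt;As a rough sketch of keeping one ordering per column, the helper below builds a SQL CASE expression per category column. This swaps create_map for a plain CASE expression so the logic can be checked without a Spark session. The helper name order_case_sql is hypothetical, and the column names and category lists are the illustrative ones from the example above.&lt;/P&gt;

```python
def order_case_sql(column, categories):
    """Build a SQL CASE expression that maps each category label to its rank.

    Labels missing from `categories` get rank len(categories), so they sort last.
    Assumes labels contain no single quotes (no escaping is attempted).
    """
    branches = []
    for rank, label in enumerate(categories):
        if "'" in label:
            raise ValueError(f"label {label!r} contains a single quote")
        branches.append(f"WHEN '{label}' THEN {rank}")
    return f"CASE {column} " + " ".join(branches) + f" ELSE {len(categories)} END"


# Hypothetical columns and orderings, one expression per figure
order_1 = order_case_sql("category_col1", ["S", "M", "L", "XL"])
order_2 = order_case_sql("category_col2", ["XS", "S", "M"])

print(order_1)
print(order_2)
```

&lt;P&gt;Each string could then be passed to pyspark.sql.functions.expr, for example df.withColumn('order_1', expr(order_1)), and saved with the table so each figure can sort by its own rank column.&lt;/P&gt;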
&lt;P&gt;BUT&lt;/P&gt;
&lt;P&gt;If you want the &lt;STRONG&gt;end user&lt;/STRONG&gt; to be able to sort the final Databricks visualisation / table by clicking values, that's something we don't have at the moment. I think it's a sensible ask, so I'll raise this with our BI team.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Sep 2024 14:25:35 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88509#M37544</guid>
      <dc:creator>holly</dc:creator>
      <dc:date>2024-09-04T14:25:35Z</dc:date>
    </item>
    <item>
      <title>Re: Dashboard use case - order of bars</title>
      <link>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88515#M37545</link>
      <description>&lt;P&gt;Thanks for your effort!&lt;/P&gt;</description>
      <pubDate>Wed, 04 Sep 2024 14:27:38 GMT</pubDate>
      <guid>https://community.databricks.com/t5/data-engineering/dashboard-use-case-order-of-bars/m-p/88515#M37545</guid>
      <dc:creator>Henrik_</dc:creator>
      <dc:date>2024-09-04T14:27:38Z</dc:date>
    </item>
  </channel>
</rss>

