Extracting useful insights from large, complex data requires modern tools. With the right tools, organizations can make data-driven decisions and gain a better understanding of their operations. Databricks has built-in support for charts and visualizations in both Databricks SQL and notebooks. This post discusses another useful library for building dashboards and applications in pure Python: Bokeh.
Bokeh is a Python library for creating interactive visualizations for web browsers. It lets you build anything from simple plots to complex dashboards with streaming data. The visualizations are powered by JavaScript, but you do not need to write any JavaScript yourself. It is a flexible library that suits many different use cases.
We will follow the steps below to create our dashboard:
1. First, install the necessary dependencies. We will use the Flask framework to create a shareable application, geopandas to create a world map visualization (optional), and databricks-sql-connector to fetch data from Delta tables.
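Before moving on, it can help to confirm that the dependencies are importable in the current environment. The following is a minimal sketch; the `REQUIRED_MODULES` list and the `missing_modules` helper are hypothetical names introduced for illustration, and the list assumes the import names `bokeh`, `flask`, `geopandas`, `pandas`, and `databricks.sql` (the import name of the databricks-sql-connector package).

```python
from importlib.util import find_spec

# Import names (not pip package names) for the dependencies named above.
# "databricks.sql" is provided by the databricks-sql-connector package.
REQUIRED_MODULES = ["bokeh", "flask", "geopandas", "pandas", "databricks.sql"]


def missing_modules(modules):
    """Return the subset of module names that cannot be found."""
    missing = []
    for name in modules:
        try:
            found = find_spec(name) is not None
        except ModuleNotFoundError:
            # Raised for dotted names when the parent package is absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Calling `missing_modules(REQUIRED_MODULES)` returns the names that still need to be installed; an empty list means everything is available.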
2. Next, configure the Databricks SQL connector to fetch data from Delta tables into pandas DataFrames.
The http_path value can be obtained from the cluster configuration (under Advanced options, on the JDBC/ODBC tab).
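A minimal sketch of this step is shown here. It assumes a personal access token is used for authentication; the helper names `connect_to_databricks` and `fetch_dataframe` are hypothetical, and the hostname, path, token, and table name in the usage comment are placeholders to be replaced with your own values.

```python
from contextlib import closing

import pandas as pd


def connect_to_databricks(server_hostname, http_path, access_token):
    """Open a Databricks SQL connection (needs databricks-sql-connector)."""
    # Imported here so fetch_dataframe can be used with any DB-API connection.
    from databricks import sql

    return sql.connect(
        server_hostname=server_hostname,
        http_path=http_path,
        access_token=access_token,
    )


def fetch_dataframe(connection, query):
    """Run a query on a DB-API connection and return a pandas DataFrame."""
    with closing(connection.cursor()) as cursor:
        cursor.execute(query)
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    return pd.DataFrame([tuple(row) for row in rows], columns=columns)


# Usage (placeholder values, not real endpoints or tables):
# conn = connect_to_databricks("<workspace-hostname>", "<http-path>", "<token>")
# orders_df = fetch_dataframe(conn, "SELECT * FROM <schema>.<orders_table>")
```

Keeping the query helper separate from the connection setup means the same function can be exercised against any DB-API connection while developing locally.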
3. Now we can create the charts:
This post uses a data set from the retail sector. It contains orders that the company receives from customers in various countries, with varying order priority (urgent, high, medium, low, others). We will use visualizations to explore some of the insights this data set offers.
a) Line Chart: Shows how the number of orders in each order priority category varies by year.
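A line chart of this kind could be sketched as follows. The sample DataFrame and its column names (`year`, `order_priority`, `order_count`) are assumptions made for illustration; in practice the data would come from an aggregated query against the orders table.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.palettes import Category10
from bokeh.plotting import figure

# Hypothetical aggregated data: order counts per year and priority.
ORDERS_BY_YEAR = pd.DataFrame({
    "year": [2019, 2019, 2020, 2020, 2021, 2021],
    "order_priority": ["urgent", "low", "urgent", "low", "urgent", "low"],
    "order_count": [120, 300, 150, 280, 170, 310],
})


def make_line_chart(df):
    """Draw one line per order priority, with year on the x-axis."""
    p = figure(
        title="Orders by priority per year",
        x_axis_label="Year",
        y_axis_label="Number of orders",
    )
    colors = Category10[10]
    priorities = sorted(df["order_priority"].unique())
    for i, priority in enumerate(priorities):
        subset = df[df["order_priority"] == priority].sort_values("year")
        p.line(
            "year",
            "order_count",
            source=ColumnDataSource(subset),
            legend_label=priority,
            color=colors[i % len(colors)],
            line_width=2,
        )
    p.legend.location = "top_left"
    return p


line_chart = make_line_chart(ORDERS_BY_YEAR)
```

In a notebook, the figure can be displayed with `bokeh.plotting.show` after the output mode has been configured.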
b) Bar Chart: Plots revenue for individual countries across different years.
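One way to plot revenue by country and year is a nested categorical axis, sketched here. The revenue figures and country names are made-up sample values, and `make_bar_chart` is a hypothetical helper name.

```python
from bokeh.models import ColumnDataSource, FactorRange
from bokeh.plotting import figure

# Hypothetical revenue (USD millions) keyed by (country, year).
REVENUE = {
    ("Germany", "2020"): 1.2,
    ("Germany", "2021"): 1.6,
    ("India", "2020"): 2.1,
    ("India", "2021"): 2.4,
    ("Brazil", "2020"): 0.9,
    ("Brazil", "2021"): 1.1,
}


def make_bar_chart(revenue_by_country_year):
    """Draw bars grouped by country, with one bar per year."""
    factors = list(revenue_by_country_year.keys())
    source = ColumnDataSource(data={
        "x": factors,
        "revenue": list(revenue_by_country_year.values()),
    })
    p = figure(
        x_range=FactorRange(*factors),
        title="Revenue by country and year",
        y_axis_label="Revenue (USD millions)",
    )
    p.vbar(x="x", top="revenue", width=0.9, source=source)
    p.y_range.start = 0
    p.xgrid.grid_line_color = None
    return p


bar_chart = make_bar_chart(REVENUE)
```

The `(country, year)` tuples make Bokeh group the bars under each country label, which keeps year-over-year comparisons within a country side by side.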
c) Data Table: Lists revenue by customer ID. A table complements the charts and makes it easy to see which customers generate the most revenue.
In this example, let's make the table more readable with HTML and CSS by attaching an HTMLTemplateFormatter to the revenue column. The formatter distinguishes customers by revenue category: revenue <= $1.5M, above $1.5M up to and including $3.0M, and > $3.0M.
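A minimal sketch of such a table follows. The customer rows, the background colors, and the `make_revenue_table` helper are illustrative assumptions; only the three revenue thresholds come from the description above. The template uses the Underscore.js-style syntax that HTMLTemplateFormatter expects, where `value` is the cell value.

```python
from bokeh.models import (
    ColumnDataSource,
    DataTable,
    HTMLTemplateFormatter,
    TableColumn,
)

# Cell background depends on the revenue category:
# <= 1.5M, (1.5M, 3.0M], and > 3.0M. Colors are arbitrary choices.
REVENUE_TEMPLATE = """
<div style="background: <%=
    (function () {
        if (value <= 1500000) { return "#f8d7da"; }
        else if (value <= 3000000) { return "#fff3cd"; }
        else { return "#d4edda"; }
    }()) %>; color: black;">
  <%= value %>
</div>
"""

# Hypothetical sample rows.
CUSTOMER_REVENUE = {
    "customer_id": [101, 102, 103],
    "revenue": [900_000, 2_400_000, 4_100_000],
}


def make_revenue_table(data):
    """Build a DataTable whose revenue column is color-coded by category."""
    source = ColumnDataSource(data=data)
    columns = [
        TableColumn(field="customer_id", title="Customer ID"),
        TableColumn(
            field="revenue",
            title="Revenue (USD)",
            formatter=HTMLTemplateFormatter(template=REVENUE_TEMPLATE),
        ),
    ]
    return DataTable(source=source, columns=columns, index_position=None)


revenue_table = make_revenue_table(CUSTOMER_REVENUE)
```

Because the template is evaluated in the browser, the thresholds live in the JavaScript snippet rather than in Python.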
d) Map: A world map visualization presents the revenue from different countries in a more engaging way.
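The map could be sketched as a choropleth built from GeoJSON. To keep the example self-contained, it uses two hand-written rectangular "countries" instead of real borders; with geopandas, the GeoJSON string would typically come from calling `to_json()` on a GeoDataFrame of country shapes joined with revenue. The `revenue` and `name` property names and the `make_revenue_map` helper are assumptions for illustration.

```python
import json

from bokeh.models import GeoJSONDataSource, HoverTool, LinearColorMapper
from bokeh.palettes import Viridis256
from bokeh.plotting import figure


def _square(lon, lat, size=10):
    """Return a closed square polygon ring (placeholder for a real border)."""
    return [[
        [lon, lat], [lon + size, lat], [lon + size, lat + size],
        [lon, lat + size], [lon, lat],
    ]]


# Hypothetical GeoJSON; real data would come from a GeoDataFrame's to_json().
SAMPLE_GEOJSON = json.dumps({
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature",
         "properties": {"name": "A", "revenue": 1_000_000},
         "geometry": {"type": "Polygon", "coordinates": _square(0, 0)}},
        {"type": "Feature",
         "properties": {"name": "B", "revenue": 4_000_000},
         "geometry": {"type": "Polygon", "coordinates": _square(20, 10)}},
    ],
})


def make_revenue_map(geojson, low, high):
    """Color each polygon by its 'revenue' property."""
    geo_source = GeoJSONDataSource(geojson=geojson)
    mapper = LinearColorMapper(palette=Viridis256, low=low, high=high)
    p = figure(title="Revenue by country", tools="pan,wheel_zoom,reset")
    p.patches(
        "xs", "ys",
        source=geo_source,
        fill_color={"field": "revenue", "transform": mapper},
        line_color="gray",
        line_width=0.5,
    )
    p.add_tools(HoverTool(tooltips=[("Country", "@name"),
                                    ("Revenue", "@revenue")]))
    return p


world_map = make_revenue_map(SAMPLE_GEOJSON, low=0, high=5_000_000)
```

The hover tool shows the country name and revenue, which compensates for the loss of exact numbers in a color-coded map.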
Now we can create the full dashboard:
The dashboard uses a tabbed layout with two tabs that reuse the charts created above. Tab 1 contains the Line Chart, Bar Chart, and Data Table, and Tab 2 contains the Map.
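The tabbed layout described above might look like the following sketch. The placeholder figures stand in for the real charts, and the tab titles and `make_dashboard` helper are hypothetical. The panel class is named `TabPanel` in Bokeh 3.x and `Panel` in Bokeh 2.x, so the import falls back accordingly.

```python
from bokeh.layouts import column, row
from bokeh.models import Tabs
from bokeh.plotting import figure

try:  # Bokeh 3.x
    from bokeh.models import TabPanel
except ImportError:  # Bokeh 2.x uses the name Panel for the same model
    from bokeh.models import Panel as TabPanel


def placeholder_chart(title):
    """Stand-in for one of the charts built in the earlier steps."""
    p = figure(title=title)
    p.line([1, 2, 3], [1, 4, 9])
    return p


def make_dashboard(line_chart, bar_chart, data_table, world_map):
    """Tab 1: line chart, bar chart, and table. Tab 2: map."""
    tab1 = TabPanel(
        child=column(row(line_chart, bar_chart), data_table),
        title="Orders and revenue",
    )
    tab2 = TabPanel(child=world_map, title="Revenue map")
    return Tabs(tabs=[tab1, tab2])


dashboard = make_dashboard(
    placeholder_chart("Line chart"),
    placeholder_chart("Bar chart"),
    placeholder_chart("Data table"),
    placeholder_chart("Map"),
)
```

Passing the charts in as arguments keeps the layout code independent of how each chart is built.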
Tab 1:
Tab 2:
Up to this point, we have created all the charts and the dashboard in the notebook interface only. To turn the dashboard into a shareable application, we embed it in the Flask framework.
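One possible way to serve the dashboard from Flask is to render it to a standalone HTML page on each request, sketched here. The `build_layout` function is a hypothetical stand-in for the code that builds the tabbed dashboard, and the route, page title, host, and port are assumptions rather than values from the original notebook.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN


def build_layout():
    """Hypothetical stand-in for the function that builds the dashboard."""
    p = figure(title="Placeholder dashboard")
    p.line([1, 2, 3], [3, 1, 2])
    return p


def render_dashboard_html(layout):
    """Render a Bokeh layout as a complete HTML page using CDN resources."""
    return file_html(layout, resources=CDN, title="Retail dashboard")


def create_app(layout_factory=build_layout):
    """Create a Flask app that serves the dashboard at the root URL."""
    # Imported here so the rendering helper can be reused without Flask.
    from flask import Flask

    app = Flask(__name__)

    @app.route("/")
    def dashboard():
        # Build fresh models for every request, since a Bokeh model
        # can belong to only one document at a time.
        return render_dashboard_html(layout_factory())

    return app


# To start the server (assumed host and port):
# create_app().run(host="0.0.0.0", port=5000)
```

Rendering a static page keeps the setup simple; widgets that need Python callbacks would instead require a Bokeh server.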
Once the Flask application is running, its URL can be shared with other users.
In conclusion, Bokeh is a versatile and effective Python library for creating interactive visualizations for data exploration, analysis, and communication. Its user-friendly design and many customization options make it a good fit for both new and experienced users, and because its output is web-native, charts can easily be shared and embedded in websites or applications. By using Databricks' distributed computing capabilities, users can prepare data for complex visualizations at scale, streamline their data analysis workflows, and produce graphs that convey their findings effectively. Furthermore, Delta table features such as data versioning, data integrity checks, and optimizations help keep the data behind the visualizations consistent and reliable. With Databricks' data processing and analytics capabilities combined with Bokeh's visualization features, users can extract key insights and make informed decisions.
The full Databricks notebook can be found here: