Before you begin, make sure your Databricks cluster is up and running. The following steps walk through integrating Azure Databricks with Power BI Desktop.
Step 1: Construct the connection URL
Navigate to the cluster and click Advanced Options, as shown below:
Scroll down and select the JDBC/ODBC tab. Copy the JDBC URL into Notepad. We'll need to modify this URL to establish a Spark cluster connection in Power BI Desktop:
- First, replace jdbc:spark with https.
- Next, delete two sections from the URL: everything from default;transportMode... up to ...Path=, and again everything from AuthMech... to ...token>.
- The final URL should look like this: https://&lt;region&gt;.azuredatabricks.net:&lt;port&gt;/sql/protocolv1/o/&lt;org-id&gt;/&lt;cluster-id&gt;
- Keep it somewhere safe; we'll use it in Power BI to make the connection.
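The manual edits above can also be sketched in code. The snippet below is a minimal illustration of the same transformation; the example JDBC URL uses hypothetical region, org, and cluster IDs, and assumes the URL follows the usual Databricks format.

```python
import re

def jdbc_to_powerbi_url(jdbc_url: str) -> str:
    """Transform a Databricks JDBC URL into the server URL Power BI expects.

    Applies the same edits described above:
      1. replace the jdbc:spark scheme with https
      2. drop the default;transportMode=... segment
      3. drop the AuthMech=...;UID=token;PWD=<token> segment,
         keeping the httpPath value as the URL path
    """
    # 1. jdbc:spark://host:port/... -> https://host:port/...
    url = jdbc_url.replace("jdbc:spark://", "https://", 1)
    # 2 & 3. keep only host:port plus the httpPath value
    match = re.match(r"https://([^/]+)/.*httpPath=([^;]+)", url)
    if not match:
        raise ValueError("unexpected JDBC URL format")
    host, http_path = match.groups()
    return f"https://{host}/{http_path}"

# hypothetical example URL; region, org, and cluster IDs are placeholders
example = ("jdbc:spark://westus.azuredatabricks.net:443/default;"
           "transportMode=http;ssl=1;"
           "httpPath=sql/protocolv1/o/123456789/0123-456-abc;"
           "AuthMech=3;UID=token;PWD=<personal-access-token>")
print(jdbc_to_powerbi_url(example))
# -> https://westus.azuredatabricks.net:443/sql/protocolv1/o/123456789/0123-456-abc
```

Scripting the edit like this avoids typos when you repeat the setup for several clusters.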
Step 2: Generate a personal access token
To connect to the cluster, we'll need a personal access token from Databricks. To obtain one, open the Databricks portal and click your user profile icon in the upper right-hand corner, as shown below:
Then select User Settings:
On the Access Tokens tab, click the Generate New Token button, as shown below:
Enter a description for the token and set its lifetime. In this demonstration, the expiration is set to 7 days; you can choose a value that suits your specific business requirements. Then click Generate:
Copy the token value and paste it into Notepad, because you will not be able to view it again. After saving it, click Done:
You can see that the token has been generated successfully:
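If you prefer to script token creation instead of using the portal, Databricks also exposes a Token REST API (POST /api/2.0/token/create). The sketch below only builds the request; the workspace URL and the existing PAT used to authenticate are placeholders you must supply, and the actual HTTP call is shown as a comment.

```python
import json

DAYS = 7  # token lifetime used in the walkthrough above

def build_token_request(workspace_url: str, existing_pat: str,
                        comment: str, lifetime_days: int):
    """Build the endpoint, headers, and body for a Databricks
    Token API call (POST /api/2.0/token/create)."""
    endpoint = f"{workspace_url}/api/2.0/token/create"
    headers = {
        "Authorization": f"Bearer {existing_pat}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "comment": comment,
        "lifetime_seconds": lifetime_days * 24 * 60 * 60,
    })
    return endpoint, headers, body

# placeholders: substitute your workspace URL and an existing PAT
endpoint, headers, body = build_token_request(
    "https://westus.azuredatabricks.net", "<existing-pat>",
    "Power BI Desktop", DAYS)
# To actually create the token, send the request, e.g. with `requests`:
#   resp = requests.post(endpoint, headers=headers, data=body)
#   new_token = resp.json()["token_value"]
print(endpoint)
```

Scripted creation is handy when tokens expire on a short schedule, as with the 7-day lifetime chosen here.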
Step 3: Connect from Power BI Desktop
For the integration to work, we first need to launch the Power BI Desktop app. If you don't already have it, you can download the most recent version. This article makes use of the same CSV file (1000 Sales Records.csv) that we used earlier; you can upload it to the Databricks portal using the Create Table with UI option.
In Power BI Desktop, on the Home ribbon, click the Get data drop-down list and choose More...:
In the Get Data dialog box, choose the Other option, select the Spark connector, and click Connect:
In the Spark dialog box, paste the URL we constructed in step 1 into the Server field. Choose HTTP as the Protocol, select DirectQuery as the Data Connectivity mode, and click OK.
In the next dialog box, enter token as the User name and paste the token value we generated in step 2 into the Password field. Click Connect:
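The user name here is literally the word token, not your own user name. Under the hood, this username/password pair is sent as standard HTTP Basic authentication against the cluster's HTTP endpoint; the short sketch below illustrates what that header looks like (the PAT value is a placeholder).

```python
import base64

def basic_auth_header(username: str, password: str) -> str:
    """Compose an HTTP Basic auth header value from a
    username/password pair (here: "token" plus the PAT)."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")

# the username is the literal string "token"; the password is the PAT
print(basic_auth_header("token", "<personal-access-token>"))
```

Knowing this makes it easier to debug 401 responses: if the token has expired or the user name isn't exactly token, authentication fails before any table metadata is returned.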
If everything is in order, you should see all the tables in your Databricks cluster in the Power BI Navigator dialog. Select the table(s) you want, then either click Load to load the data, or click Edit to transform it before loading it into Power BI Desktop.
You can now explore and visualize this data just as you would any other data source in Power BI Desktop. If you're unfamiliar with Power BI Desktop visualizations, I suggest taking a Power BI course.