10-22-2024 05:27 PM
I referred to this document to connect Power BI Desktop and the Power BI Service to Azure Databricks.
Connect Power BI to Azure Databricks - Azure Databricks | Microsoft Learn
However, I have a couple of questions and concerns. Can anyone kindly help?
Thank you.
Regards,
Albert
10-22-2024 05:44 PM
Moreover, when I create an all-purpose compute for the Power BI Service, I realize that I cannot choose a service principal as the user for the single user access option. Shall I use `no isolation shared`?
10-22-2024 06:02 PM
Looks like this is the only option. I need to use multi-node mode; I cannot use single-node mode because I also need the compute to support Unity Catalog.
Single user access mode does not support assigning a service principal as the single user;
single-node compute does not support shared access mode;
no isolation shared mode does not support Unity Catalog.
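Under those constraints, the compute I end up with looks roughly like the following cluster spec (a hedged sketch using Clusters API field names as I understand them; the cluster name and node type are placeholders, and `USER_ISOLATION` is, to my knowledge, the API value for shared access mode):

```json
{
  "cluster_name": "powerbi-import-cluster",
  "spark_version": "15.4.x-scala2.12",
  "node_type_id": "Standard_D4ads_v5",
  "autoscale": { "min_workers": 1, "max_workers": 1 },
  "data_security_mode": "USER_ISOLATION",
  "autotermination_minutes": 10
}
```

Shared access mode is what allows a service principal to use the cluster while keeping Unity Catalog support, at the cost of requiring multi-node compute.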
10-30-2024 11:10 AM
Hi guys, can anybody help?
10-30-2024 12:17 PM
Hi Albert,
Why are you not using a Databricks SQL Warehouse?
Here’s a list of benefits for using Databricks SQL Warehouse with Power BI, covering performance, scalability, caching, and more:
1. Optimized Performance for BI Queries: SQL Warehouses are tuned for analytical workloads, providing low-latency and high-performance query execution for Power BI dashboards and reports.
2. Scalability: Autoscaling ensures SQL Warehouses can adjust resources dynamically based on query demand, handling varying workloads without manual intervention.
3. Direct Query Support: Enables real-time data access in Power BI with DirectQuery, so users always see the latest data without requiring a local data cache.
4. Reduced Query Latency through Caching: SQL Warehouses cache query results, enabling faster response times for frequently accessed data, enhancing dashboard interactivity.
5. Resource Efficiency: Cached results lower the need for re-processing large datasets, reducing compute costs and optimizing resource usage.
6. Enhanced User Experience: With caching, Power BI users experience faster load times and smoother interactions, especially for filtering, drilling down, and refreshing views.
7. Automatic Cache Management: Databricks SQL Warehouse handles cache updates and invalidations automatically, ensuring data accuracy without manual cache clearing.
8. Query Consolidation: When multiple users or reports request similar data, the warehouse can serve these queries from cache, reducing redundant queries and improving performance during high-usage periods.
9. Connection Stability and Security: Managed connections offer robust security, including identity management, encryption, and fine-grained access controls, which are critical for compliance and secure data handling.
10-30-2024 12:47 PM
Hi @h_h_ak Thank you for your reply. Do you mean SQL warehouses (Serverless SQL)?
One reason we don't use it is the price. In our region, the cheapest all-purpose compute and SQL warehouse are:
I know 2X-Small is faster than D4ads v5.
However, currently we import data from Databricks to Power BI in a daily schedule. Therefore, we don't care too much about the compute startup time and the querying performance.
Moreover, unlike Snowflake, a serverless SQL warehouse auto-stops only after 10 minutes of inactivity. That means we have to pay for 10 extra minutes of idle compute.
Therefore, I think job compute is the best option. However, the Power BI connector does not support using Databricks job compute.
Btw, whatever compute we use, we still need to use a PAT of the service principal to authenticate the Power BI Service with the Databricks workspace.
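For reference, such a PAT can be minted on behalf of a service principal via the Token Management API. Below is a hedged sketch that only builds the request; the endpoint and payload fields reflect my understanding of the REST API, and the workspace URL and application ID in the example are placeholders:

```python
import json


def build_obo_token_request(workspace_url: str, application_id: str,
                            lifetime_seconds: int = 3600) -> tuple[str, str]:
    """Build the POST request for minting a PAT on behalf of a service principal.

    The caller must send this as an authenticated POST (e.g. with a workspace
    admin's token) to the Token Management API's on-behalf-of endpoint.
    """
    url = f"{workspace_url}/api/2.0/token-management/on-behalf-of/tokens"
    payload = json.dumps({
        "application_id": application_id,          # the SP's application (client) ID
        "lifetime_seconds": lifetime_seconds,      # token time-to-live
        "comment": "Power BI Service connection",  # free-text label
    })
    return url, payload


# Example with placeholder values:
url, body = build_obo_token_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "00000000-0000-0000-0000-000000000000",
)
```

The returned token value would then be pasted into the Power BI Databricks connector as the personal access token.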
Thank you.
Regards,
Albert
10-30-2024 01:04 PM - edited 10-30-2024 01:09 PM
Yes, you can select serverless or classic mode.
You can also set auto-termination to 1 minute if the cluster is not used:
https://community.databricks.com/t5/warehousing-analytics/1-min-auto-termination/td-p/65534
But keep in mind that you also have to pay for the cluster startup time in the case of an all-purpose cluster.
And here is a recent post about seamless integration:
10-30-2024 02:12 PM
Thank you for your further assistance. However, it seems like the cluster autotermination time cannot be less than 10 minutes.
Using the CLI:
databricks clusters edit --json '{"cluster_id":"1234-123456-abcdef1f","spark_version":"15.4.x-scala2.12","autotermination_minutes":1,"autoscale":{"max_workers":1,"min_workers":1},"node_type_id":"Standard_D4ads_v5"}'
>>
Error: The cluster autotermination time cannot be less than 10 minutes.
11-12-2024 02:23 AM
For serverless SQL warehouses it's actually 5 minutes:
In the example above, you tried to edit a classic compute cluster.
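The idle timeout for a SQL warehouse is set through the SQL Warehouses API (field `auto_stop_mins`), not through `autotermination_minutes` on the Clusters API. A hedged sketch that only builds the request (endpoint and field name per my understanding of the API; the warehouse ID is a placeholder):

```python
import json


def build_warehouse_edit_request(workspace_url: str, warehouse_id: str,
                                 auto_stop_mins: int) -> tuple[str, str]:
    """Build the POST request for changing a SQL warehouse's idle auto-stop.

    The SQL Warehouses API uses `auto_stop_mins`; serverless warehouses
    accept lower values than the 10-minute floor on classic clusters.
    """
    url = f"{workspace_url}/api/2.0/sql/warehouses/{warehouse_id}/edit"
    payload = json.dumps({"auto_stop_mins": auto_stop_mins})
    return url, payload


# Example with placeholder values:
url, body = build_warehouse_edit_request(
    "https://adb-1234567890123456.7.azuredatabricks.net",
    "abcdef1234567890",
    5,
)
```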
11-12-2024 03:37 AM
Hi @AlbertWang
You have multiple options to connect Power BI with Databricks:
Using Cluster Credentials: Under the cluster details, go to the Advanced Options and select JDBC/ODBC. Here, you’ll find the necessary credentials, such as hostname and HTTP path, for connecting.
Using SQL Warehouse: Go to the SQL Warehouse section and select Connection Details, where you’ll find the hostname and HTTP path needed for the connection.
Using Partner Connect: Click on Power BI within Partner Connect. This will prompt you to download a Power BI connection file for easy setup.
Using Delta Sharing: You can also connect Power BI with Databricks via Delta Sharing for data access.
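Whichever of the first two options you choose, the Power BI Databricks connector asks for the same two fields: the Server Hostname and the HTTP Path. If you start from the JDBC URL on the cluster's JDBC/ODBC tab, they can be picked out like this (a small sketch; the URL shape shown is illustrative and may differ between driver versions):

```python
def parse_jdbc_url(jdbc_url: str) -> dict:
    """Extract the Server Hostname and HTTP Path that the Power BI
    Databricks connector asks for, from a JDBC URL of the general shape
    jdbc:databricks://<host>:443/default;transportMode=http;...;httpPath=<path>
    """
    # Drop the scheme, then split host[:port][/schema] from the ;-separated params.
    rest = jdbc_url.removeprefix("jdbc:databricks://")
    head, *params = rest.split(";")
    hostname = head.split(":")[0].split("/")[0]
    fields = dict(p.split("=", 1) for p in params if "=" in p)
    return {"server_hostname": hostname, "http_path": fields.get("httpPath", "")}
```

For a SQL warehouse, the Connection Details tab already lists both fields separately, so no parsing is needed.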