cancel
Showing results forย 
Search instead forย 
Did you mean:ย 
Get Started Discussions
Start your journey with Databricks by joining discussions on getting started guides, tutorials, and introductory topics. Connect with beginners and experts alike to kickstart your Databricks experience.
cancel
Showing results forย 
Search instead forย 
Did you mean:ย 

Improve query performance of direct query with Databricks

viniciuscini
New Contributor

Iโ€™m building a dashboard in Power BIโ€™s Pro Workspace, connecting data via Direct Query from Databricks (around 60 million rows from 15 combined tables), using a SQL Serverless (small size and 4 clusters).

The problem is that the dashboard is taking too long to load.

Considering that I canโ€™t change my workspace to Premium, work with import mode, or use aggregation tables, is there anything else I can do besides what Iโ€™ve already done (listed below)?

  • Model tables in a โ€œStar schemaโ€ in Power BI;
  • Materialize all tables in Databricks, avoiding as much as possible calculations and joins inside Power BI;
  • Check the โ€œReferential Integrityโ€ box when joining tables in Power BI;
  • Avoid using bi-directional filters;
  • Avoid using calculated columns or any field with โ€œline-to-line interactionโ€;
  • Avoid nesting one calculation inside another;
  • Set up dimension tables in Dual Storage Mode;
  • Add an โ€œApplyโ€ button to apply all changes at once.
  • Increase the โ€œMaximum connections per data sourceโ€ in Power BI's configuration;
  • Increase the โ€œMaximum number of simultaneous evaluationsโ€ in Power BI's configuration;
  • Increase the โ€œMaximum memory used per simultaneous evaluationโ€ in Power BI's configuration;
  • Uncheck โ€œAuto date/timeโ€ in Power BI's configuration;
  • Uncheck โ€œAllow data preview to download in the backgroundโ€ in Power BI's configuration;
  • Uncheck โ€œEnable parallel loading of tablesโ€ in Power BI's configuration;
  • Uncheck โ€œTurn on Q&A to ask natural language questions about your dataโ€ in Power BI's configuration;
  • Uncheck โ€œCross highlighting/filtering by defaultโ€ in Power BI's configuration;

P.S.: I also tried to scale up the SQL Serverless (both memory and the number of nodes), but despite the dashboard taking 3-4 minutes to load, only 20%-30% of the compute power is being used. So, it does not seem to be a matter of compute power.

Due to that, I suspect that connection latency might have something to do with the slow performance.

Vini
1 REPLY 1

Brahmareddy
Esteemed Contributor

Hi @viniciuscini, How are you doing?

It sounds like you've already taken a lot of the right steps to optimize your Power BI dashboard, especially given the constraints you mentioned. Since you've ruled out many common optimizations and noticed that the compute power isn't fully utilized, connection latency could indeed be a significant factor.

To speed up your Power BI dashboard, try reducing network latency by ensuring Power BI and Databricks are in the same region. Optimize your SQL queries in Databricks, and use caching and partitioning to improve performance. Minimize complex calculations and measures in Power BI, and check that query folding is happening. Monitor both Databricks and Power BI for any bottlenecks, and consider tweaking the parallelism settings. Lastly, ensure your machine and browser are optimized for performance.

Regards,

Brahma

Join Us as a Local Community Builder!

Passionate about hosting events and connecting people? Help us grow a vibrant local communityโ€”sign up today to get started!

Sign Up Now