Dashboard performance issues rarely come from a single place. They’re usually the combined effect of dashboard design, warehouse concurrency and caching, and data layout in your lakehouse. If you optimize only one layer (SQL, compute sizing, or table layout), you’ll often see partial wins, but the dashboard can still feel slow or unpredictable under real usage.
In this post, we take a holistic approach to Databricks AI/BI performance. We’ll follow a dashboard interaction end-to-end: from the browser and AI/BI orchestration layer, through Databricks SQL admission and caching behavior, down to file scanning and data skipping in the Lakehouse. Along the way, we’ll highlight the patterns that most often drive latency spikes, queueing, and cost at scale—especially when many users interact with the same dashboards concurrently.
To optimize performance, you must first understand the journey a single click takes through the stack. When a user opens a dashboard or changes a filter, a chain reaction occurs across multiple layers. If any layer is misconfigured, the user feels the lag.
By optimizing each of these four touchpoints, you move away from brute-force compute and toward a streamlined architecture that scales with your users.
Before optimizing anything, you must first define what you are optimizing for. Dashboard performance is not a single concept, and improvements only make sense when tied to a clear target. Common goals include reducing time to first visual, improving interaction latency, keeping performance stable under concurrency, or lowering the cost per dashboard view.
Once the goal is clear, you need to understand the parameters that shape it. These include the size and growth of the data, the number of users and their access patterns, and how queries behave in practice: how many fire on page load, how much data they scan, and whether results are reused or constantly recomputed. Without this context, optimization becomes guesswork and often shifts cost or latency from one layer to another.
Effective dashboard optimization is, therefore, intentional: pick a measurable target, understand the data and usage patterns that influence it, and only then apply the technical optimizations that follow.
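As a rough illustration of tying optimization to a measurable target, the sketch below summarizes a small query log into latency percentiles and cost per dashboard view. The log rows, field layout, and cost units are hypothetical placeholders, not a real Databricks system table schema.

```python
from statistics import median, quantiles

# Hypothetical query-log rows: (dashboard view id, seconds to result, cost units)
query_log = [
    ("view-1", 0.8, 0.02),
    ("view-1", 2.4, 0.05),
    ("view-2", 1.1, 0.02),
    ("view-2", 6.0, 0.09),
    ("view-3", 0.9, 0.02),
]

def summarize(log):
    """Reduce a query log to a few target metrics for one dashboard."""
    latencies = [seconds for _, seconds, _ in log]
    views = {view for view, _, _ in log}
    return {
        "p50_latency_s": median(latencies),
        # Last of the 19 cut points for n=20 is the 95th percentile
        "p95_latency_s": quantiles(latencies, n=20, method="inclusive")[-1],
        "cost_per_view": sum(cost for _, _, cost in log) / len(views),
    }

summary = summarize(query_log)
print(summary)
```

Tracking the same few numbers before and after each change makes it easier to see whether an optimization helped or only moved cost or latency to another layer.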
Every visible tile is a potential trigger: it runs on first load and can re-run when filters/parameters change, on refresh, and when users navigate back to a page. Tabs limit those re-executions to the active page, reducing bursts and head-of-line blocking.
AI/BI dashboards let you build multi‑page reports. Group visuals into pages aligned to user intent (Overview → Investigate → Deep dive), so only the current page executes. This reduces head‑of‑line blocking, shapes concurrency into smaller bursts, and increases cache hit rates for repeated deterministic queries.
Recommended page types:
Favor deterministic tiles (avoid NOW()) to maximize result cache hits. Monitor Peak Queued Queries, and increase cluster size or max clusters if it stays persistently above 0.
The drill-through feature in AI/BI Dashboards enables navigation from high-level visuals to detailed pages while carrying the selected context. It is a useful strategy to enforce a page-based design by deferring expensive queries until user intent is clear, improving first-paint performance and reducing unnecessary concurrency spikes.
Callout — Why this helps on any warehouse type: Smaller, predictable bursts make Serverless IWM react fast and avoid over-scaling, and they prevent Pro/Classic from saturating cluster slots during page loads.
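To make the burst-shaping argument concrete, here is a back-of-the-envelope sketch. It assumes one query per tile and no cache hits, so it gives an upper bound; the tile counts per page are hypothetical.

```python
# Hypothetical layout: number of tiles on each page
pages = {"Overview": 4, "Investigate": 6, "Deep dive": 8}

def queries_on_open(pages, concurrent_users, landing_page=None):
    """Upper bound on queries dispatched when users open the dashboard together.

    With landing_page=None, every tile lives on a single page and all of them run.
    Otherwise only the tiles on the landing page run.
    """
    tiles = sum(pages.values()) if landing_page is None else pages[landing_page]
    return tiles * concurrent_users

single_page_burst = queries_on_open(pages, concurrent_users=25)
paged_burst = queries_on_open(pages, concurrent_users=25, landing_page="Overview")
print(single_page_burst, paged_burst)
```

In this made-up layout, splitting the dashboard into pages cuts the opening burst from 450 queries to 100 for the same 25 users.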
For more details, see: https://www.databricks.com/blog/whats-new-in-aibi-dashboards-fall24
The first impression of a dashboard is defined by its first paint—the time it takes from opening the dashboard to seeing meaningful results. Default filter values play a critical role here because they determine which queries run immediately on page load and how much data those queries must process.
When filters have no defaults, AI/BI dashboards often load the entire dataset on first open. This maximizes scan volume, increases query fan-out across tiles, and delays the moment when users see usable insights. The result is a slow, unpredictable first experience—especially painful during peak concurrency when many users open the same dashboard.
Smart defaults fix this by constraining the initial query shape. Common examples include:
Technically, default filters reduce the amount of data scanned, improve cache hit rates, and allow result reuse across users who open the dashboard with the same initial state. This directly improves time-to-first-visual and makes performance far more consistent.
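One way to picture a shared initial state is a default date window aligned to day boundaries. The sketch below is an assumed example (a trailing seven-day window ending yesterday), not a built-in AI/BI setting. Because the window depends only on the calendar date, everyone who opens the dashboard on the same day gets the same default range.

```python
from datetime import date, datetime, timedelta

def default_date_range(opened_at, days=7):
    """Return (start, end) for a trailing window of `days` full days ending yesterday."""
    end = opened_at.date() - timedelta(days=1)
    start = end - timedelta(days=days - 1)
    return start, end

# Two viewers opening the dashboard at different times on the same day
morning = default_date_range(datetime(2025, 3, 10, 9, 5))
afternoon = default_date_range(datetime(2025, 3, 10, 16, 40))
print(morning == afternoon, morning)
```

A default computed from the exact open time would differ per viewer and weaken the reuse described above.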
The key design principle is simple: optimize for the landing experience. Make the first paint fast and informative, then let users broaden the scope intentionally as they explore. A fast first paint builds trust, encourages adoption, and sets the performance baseline for every interaction that follows.
For more details, see: https://docs.databricks.com/aws/en/dashboards/filters#-set-default-filter-values
Parameters are one of the most effective ways to scale AI/BI dashboards because they shape the query before it runs. By injecting values directly into the SQL, they push predicates down early—so Databricks can prune data sooner and do far less work per query (ideally filtering before joins and aggregations).
Field filters behave differently. If the underlying dataset is small enough to be cached in the browser (≤ 100K rows and ≤ 100MB), field filters and cross-filter interactions can be evaluated client-side with no warehouse round-trip. Once datasets exceed that threshold, field filters are typically pushed to the warehouse by wrapping the dataset query (often via a CTE), which triggers backend SQL execution and may not reduce scan cost as effectively as parameterized predicates applied before joins and aggregation.
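The difference in query shape can be sketched with Python's built-in sqlite3 module standing in for the warehouse. The tables, columns, and the exact wrapping SQL are hypothetical; the wrapper that AI/BI generates may look different, and an optimizer may still push the outer predicate down in simple cases. Both queries return the same rows, but the first filters before the join and aggregation, while the second aggregates every region and filters afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, region TEXT, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, segment TEXT);
    INSERT INTO orders VALUES (1, 'EMEA', 10, 100.0), (2, 'EMEA', 11, 50.0),
                              (3, 'APAC', 10, 70.0), (4, 'AMER', 12, 30.0);
    INSERT INTO customers VALUES (10, 'Enterprise'), (11, 'SMB'), (12, 'SMB');
""")

# Parameter style: the predicate sits inside the dataset query, ahead of the join
PARAMETERIZED = """
    SELECT c.segment, SUM(o.amount) AS revenue
    FROM (SELECT * FROM orders WHERE region = :region) AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.segment
    ORDER BY c.segment
"""

# Field-filter style: the unfiltered dataset is wrapped in a CTE and filtered last
WRAPPED = """
    WITH dataset AS (
        SELECT o.region, c.segment, SUM(o.amount) AS revenue
        FROM orders AS o
        JOIN customers AS c ON c.customer_id = o.customer_id
        GROUP BY o.region, c.segment
    )
    SELECT segment, revenue FROM dataset WHERE region = :region ORDER BY segment
"""

def run(sql, region):
    """Execute a query with a named :region parameter and return all rows."""
    return conn.execute(sql, {"region": region}).fetchall()

print(run(PARAMETERIZED, "EMEA"))
print(run(WRAPPED, "EMEA"))
```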
Parameters are especially effective for slicing by date ranges, regions, business units, or other dimensions that significantly reduce data volume. They also make performance more predictable: each tile executes cheaper queries, and concurrency spikes become easier for the warehouse to absorb.
There is a trade-off. Because different parameter values produce different query signatures, cache reuse can be lower when users constantly choose unique values. In practice, this is usually the right compromise: a small, cheap query without a cache hit is far better than a large, expensive query that depends on caching to survive. You can further balance this by using sensible defaults and a limited set of common parameter values so popular paths still benefit from the result cache.
A practical rule of thumb is: keep datasets small enough to fit in the browser cache and use field filters for interactivity (best case: no warehouse round-trips). If you’re not confident the dataset will reliably stay within the browser-cache limits, use parameters to reduce the dataset early—so the warehouse reads less data up front—and then apply filters for deeper exploration. This turns “scan everything and filter later” dashboards into selective, scalable queries that stay fast as data and users grow.
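The rule of thumb can be written down as a small decision helper. The thresholds come from the limits quoted above; the `headroom` factor is a hypothetical safety margin for "not confident the dataset will stay within limits", and treating 100MB as binary megabytes is an assumption.

```python
BROWSER_CACHE_MAX_ROWS = 100_000
BROWSER_CACHE_MAX_BYTES = 100 * 1024**2  # assumes binary megabytes

def filtering_strategy(expected_rows, expected_bytes, headroom=0.8):
    """Suggest a filtering approach for a dataset of the given expected size."""
    fits = (
        expected_rows <= BROWSER_CACHE_MAX_ROWS * headroom
        and expected_bytes <= BROWSER_CACHE_MAX_BYTES * headroom
    )
    return "field filters" if fits else "parameters first"

print(filtering_strategy(20_000, 5 * 1024**2))
print(filtering_strategy(2_000_000, 900 * 1024**2))
```

A dataset close to the limit (for example 95,000 rows) is deliberately classified as "parameters first", since ordinary data growth would soon push it past the browser-cache threshold.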
For more details see:
Browser caching helps most when interactions are field-filter based on small datasets. Once parameters modify SQL—or datasets exceed the browser threshold—interactions shift back to warehouse execution.
Databricks AI/BI dashboards can cache dataset results directly in the user’s browser, allowing many interactions to be handled entirely client-side without round-trips to the SQL warehouse.
When a dataset is below roughly 100,000 rows, the browser becomes a local execution engine. Cross-filtering, sorting, and simple aggregations can be resolved instantly in memory, producing near-zero-latency interactions and eliminating backend concurrency pressure altogether. This is why well-designed overview pages often feel “instant,” even under heavy usage.
The browser cache is automatically used when:
Once datasets grow beyond this threshold—or when parameters modify the underlying SQL—the interaction is pushed back to the warehouse and browser caching no longer applies.
The key design takeaway is intentional dataset sizing. Keep landing-page and overview datasets compact, pre-aggregated, and focused on common KPIs so they qualify for browser caching. This delivers instant “first paint,” reduces query fan-out, and preserves warehouse capacity for deeper investigative pages where backend execution is unavoidable.
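As a minimal sketch of what "compact and pre-aggregated" means, the snippet below collapses hypothetical order-level rows into one row per day and region. In a real dashboard this would normally be done in the dataset SQL or an upstream table rather than in Python; the point is only that the row count then follows the number of dimension combinations instead of the raw fact volume.

```python
from collections import defaultdict

# Hypothetical fact rows: (order_date, region, amount)
raw_orders = [
    ("2025-03-01", "EMEA", 120.0),
    ("2025-03-01", "EMEA", 80.0),
    ("2025-03-01", "APAC", 50.0),
    ("2025-03-02", "EMEA", 40.0),
]

def daily_revenue(rows):
    """Aggregate order-level rows to one row per (order_date, region)."""
    totals = defaultdict(float)
    for order_date, region, amount in rows:
        totals[(order_date, region)] += amount
    return [(d, r, total) for (d, r), total in sorted(totals.items())]

kpi_rows = daily_revenue(raw_orders)
print(len(raw_orders), "->", len(kpi_rows), kpi_rows)
```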
For more details, see: https://docs.databricks.com/aws/en/dashboards/caching#dataset-optimizations
The cheapest, fastest query is the one you don’t execute.
Databricks SQL result caching turns repeated dashboard interactions into near-instant responses by serving results from cache instead of recomputing them.
Databricks SQL checks cached results before executing a query:
Both caches have a roughly 24-hour TTL and are invalidated when the underlying tables change, so you keep freshness while benefiting from reuse.
Cache hits are not automatic — they’re a design outcome. You get the most value when many users ask the warehouse the same question in the same way.
1) Make tiles deterministic
Avoid non-deterministic functions (e.g., NOW() / current_timestamp()), because they change the query outcome and prevent reuse. Prefer explicit date/time parameters and keep query text stable so identical selections can be served from cache.
2) Reuse datasets and keep the query “shape” consistent
Cache and reuse improve dramatically when tiles share the same dataset logic and a consistent predicate / GROUP BY shape. When multiple visuals can be answered by the same backend query (or the same canonical dataset), you reduce the number of statements dispatched and increase cache hit rates.
3) Be mindful of identity and impersonation
Result cache reuse is most effective when the same query is executed under the same access context. If your setup uses impersonation per viewer, you may reduce cache reuse because results can’t always be shared safely across identities.
Best practice: where it’s acceptable from a security and governance perspective, prefer a shared execution identity for published dashboards (for example, a service principal / shared access context) so repeated views can benefit from cache reuse. If you must use per-user impersonation, compensate by maximizing dataset reuse and focusing on deterministic, parameterized common paths.
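The three points above can be illustrated with a toy cache. This is a deliberately simplified model, not how Databricks SQL implements result caching: it assumes the cache key is exactly the query text, the parameter values, and the executing identity, and it ignores TTL and table changes. The query text and identity names are hypothetical.

```python
class ToyResultCache:
    """Illustrative only: keyed by query text, parameter values, and identity."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def run(self, query_text, params, identity, compute):
        key = (query_text, tuple(sorted(params.items())), identity)
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = compute()
        return self._store[key]

SQL = "SELECT region, SUM(amount) FROM sales WHERE order_ts >= :start GROUP BY region"
cache = ToyResultCache()

# Deterministic date parameter, shared identity: 1 miss, then 2 hits
for _ in range(3):
    cache.run(SQL, {"start": "2025-03-01"}, "dashboard-sp", lambda: "rows")

# Current timestamp captured per view: every key differs, 3 misses
for ts in ("2025-03-08 09:00:01", "2025-03-08 09:00:02", "2025-03-08 09:00:03"):
    cache.run(SQL, {"start": ts}, "dashboard-sp", lambda: "rows")

# Same deterministic query, but one identity per viewer: 3 misses
for viewer in ("alice", "bob", "carol"):
    cache.run(SQL, {"start": "2025-03-01"}, viewer, lambda: "rows")

print(cache.hits, cache.misses)
```

Only the first pattern produces reuse in this model, which is why stable query text, explicit date parameters, and a shared access context tend to go together.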
Serverless adds the remote result cache, which is shared workspace-wide, survives restarts, and is also used by ODBC/JDBC clients and the SQL Statement API. For dashboards with repeated opens and common filter paths, this often produces the biggest “free” performance win.
For published dashboards, add a schedule. Scheduled runs execute the dataset logic ahead of peak hours and populate caches, improving first paint and smoothing top-of-hour bursts.
Use:
Notes:
For benchmarking, only disable caching during controlled tests:
SET use_cached_result = false
…and re-enable it for real usage so dashboards benefit from caching in production.
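A controlled test along these lines could be wrapped in a small harness so that caching is always switched back on, even if a query fails. In this sketch `execute` is a hypothetical callable that sends one statement over a single open warehouse session (for example, a cursor's execute method); the setting is applied per session, so all statements must go through the same one.

```python
import time

def benchmark(execute, query, runs=3):
    """Time `query` with result caching off, then switch caching back on."""
    execute("SET use_cached_result = false")
    try:
        timings = []
        for _ in range(runs):
            started = time.perf_counter()
            execute(query)
            timings.append(time.perf_counter() - started)
        return timings
    finally:
        # Runs even if a query raises, so the session is not left uncached
        execute("SET use_cached_result = true")

# Dry run with a stand-in executor that only records the statements it receives
sent = []
timings = benchmark(sent.append, "SELECT 1", runs=2)
print(sent)
```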
Don’t mix unrelated workloads on the same BI warehouse. Dedicated Serverless BI warehouses per domain/workload help the remote cache fill with the repeated queries that matter for those dashboards.
For more details, see: https://www.databricks.com/blog/understanding-caching-databricks-sql-ui-result-and-disk-caches
This part focused on how dashboard design and interaction patterns shape performance before the warehouse and data are even involved. By reducing fan-out, optimizing first paint, and maximizing cache reuse, you can often unlock large gains without changing the underlying data. In the second part, we complete the picture by going deeper into the platform: how to choose and size the right SQL warehouse, how data modeling and file layout affect scan efficiency, and how precomputation, materialization, and data types keep dashboards fast and stable as usage scales.