Databricks Community

Tarzi-Simon

Dashboard performance issues rarely come from a single place. They’re usually the combined effect of dashboard design, warehouse concurrency and caching, and data layout in your lakehouse. If you optimize only one layer (SQL, compute sizing, or table layout), you’ll often see partial wins, but the dashboard can still feel slow or unpredictable under real usage.

In this post, we take a holistic approach to Databricks AI/BI performance. We’ll follow a dashboard interaction end-to-end: from the browser and AI/BI orchestration layer, through Databricks SQL admission and caching behavior, down to file scanning and data skipping in the Lakehouse. Along the way, we’ll highlight the patterns that most often drive latency spikes, queueing, and cost at scale—especially when many users interact with the same dashboards concurrently.

The anatomy of an AI/BI Dashboard refresh

To optimize performance, you must first understand the journey a single click takes through the stack. When a user opens a dashboard or changes a filter, a chain reaction occurs across multiple layers. If any layer is misconfigured, the user feels the lag.

The Browser (Client-Side): This is the first line of defense. For datasets under 100,000 rows and under 100MB, the browser acts as a local engine, handling field filters and cross-chart interactions instantly in memory. If your data exceeds this threshold, every interaction must travel back to the warehouse.
The Dashboard Design (The Orchestrator): The AI/BI service determines which queries need to fire. A "single-page" design sends every widget’s query simultaneously, creating a massive concurrency spike. A "multi-page" design only requests data for the visible tab, effectively shaping the demand on your compute.
Databricks SQL (The Engine): Your SQL Warehouse (ideally Serverless) receives the burst. It checks the cache, which has multiple layers, to see if the work has already been done. If not, Intelligent Workload Management (IWM) admits the query, autoscaling clusters in seconds to handle the load without queuing.
The Lakehouse (The Storage): Finally, the engine hits the data. It scans Delta files in Cloud Object Storage. Here, Liquid Clustering and data types determine the I/O efficiency. The goal is to skip as much data as possible using file-level statistics and metadata to return the result set back up the chain.

By optimizing each of these four touchpoints, you move away from brute-force compute and toward a streamlined architecture that scales with your users.

Prerequisite - understand your data and your dashboard

Before optimizing anything, you must first define what you are optimizing for. Dashboard performance is not a single concept, and improvements only make sense when tied to a clear target. Common goals include reducing time to first visual, improving interaction latency, keeping performance stable under concurrency, or lowering the cost per dashboard view.

Once the goal is clear, you need to understand the parameters that shape it. These include the size and growth of the data, the number of users and their access patterns, and how queries behave in practice: how many fire on page load, how much data they scan, and whether results are reused or constantly recomputed. Without this context, optimization becomes guesswork and often shifts cost or latency from one layer to another.

Effective dashboard optimization is, therefore, intentional: pick a measurable target, understand the data and usage patterns that influence it, and only then apply the technical optimizations that follow.

Optimization #1: Organize the dashboard into pages (tabs)

Every visible tile is a potential trigger: it runs on first load and can re-run when filters/parameters change, on refresh, and when users navigate back to a page. Tabs limit those re-executions to the active page, reducing bursts and head-of-line blocking.

AI/BI dashboards let you build multi‑page reports. Group visuals into pages aligned to user intent (Overview → Investigate → Deep dive), so only the current page executes. This reduces head‑of‑line blocking, shapes concurrency into smaller bursts, and increases cache hit rates for repeated deterministic queries.
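As a rough illustration of the fan-out effect, the sketch below counts how many queries fire when a dashboard opens. The page names and tile counts are hypothetical, and the model assumes one query per tile with no dataset sharing, which a real dashboard may improve on.

```python
from typing import Dict, Optional

# Hypothetical dashboard layout: page name -> number of tiles.
# Assumption: each tile issues one query when its page is rendered.
PAGES: Dict[str, int] = {"Overview": 4, "Investigate": 6, "Deep dive": 5}


def queries_on_load(pages: Dict[str, int], landing_page: Optional[str] = None) -> int:
    """Return the number of queries fired when the dashboard opens.

    landing_page=None models a single-page design where every tile is visible;
    otherwise only the tiles on the landing page execute.
    """
    if landing_page is None:
        return sum(pages.values())
    return pages[landing_page]


single_page_burst = queries_on_load(PAGES)
multi_page_burst = queries_on_load(PAGES, landing_page="Overview")
print(single_page_burst, multi_page_burst)  # 15 4
```

With these made-up numbers, the paged design sends 4 queries on open instead of 15. The remaining 11 run only if a user navigates to the other pages.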

Recommended page types:

Overview: fast counters and trendlines for first paint; keep heavy joins/windows off the landing page.
Investigate: entity‑focused exploration (customer/product/region) with filters that push predicates into SQL (parameters) when you need pre‑aggregation reduction.
Deep dive: expensive aggregations powered by scheduled refresh or materialized/metric views (you can export a dashboard dataset to a materialized view).

Favor deterministic tiles (avoid NOW()) to maximize result cache hits. Monitor Peak Queued Queries, and increase the cluster size or max clusters if the value stays above 0.

The drill-through feature in AI/BI Dashboards enables navigation from high-level visuals to detailed pages while carrying the selected context. It is a useful strategy to enforce a page-based design by deferring expensive queries until user intent is clear, improving first-paint performance and reducing unnecessary concurrency spikes.

Callout — Why this helps on any warehouse type: Smaller, predictable bursts make Serverless IWM react fast and avoid over-scaling, and they prevent Pro/Classic from saturating cluster slots during page loads.

For more details, see: https://www.databricks.com/blog/whats-new-in-aibi-dashboards-fall24

Optimization #2: Optimize the "First Paint" with Smart Defaults

The first impression of a dashboard is defined by its first paint—the time it takes from opening the dashboard to seeing meaningful results. Default filter values play a critical role here because they determine which queries run immediately on page load and how much data those queries must process.

When filters have no defaults, AI/BI dashboards often load the entire dataset on first open. This maximizes scan volume, increases query fan-out across tiles, and delays the moment when users see usable insights. The result is a slow, unpredictable first experience—especially painful during peak concurrency when many users open the same dashboard.

Smart defaults fix this by constraining the initial query shape. Common examples include:

Defaulting date filters to a recent window (for example, last 7 or 30 days).
Preselecting a common region, business unit, or top-level entity.
Choosing a sensible “All” that is still selective (for example, current fiscal year instead of all history).

Technically, default filters reduce the amount of data scanned, improve cache hit rates, and allow result reuse across users who open the dashboard with the same initial state. This directly improves time-to-first-visual and makes performance far more consistent.
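To make the "same initial state" point concrete, here is a minimal sketch of the arithmetic behind a trailing date default. The function name and the 30-day window are illustrative assumptions, and the sketch says nothing about how the dashboard itself resolves relative defaults. It only shows that every viewer opening the dashboard on the same day ends up with the same explicit bounds.

```python
from datetime import date, timedelta
from typing import Tuple


def default_date_window(today: date, days: int = 30) -> Tuple[str, str]:
    """Return (start, end) ISO dates for a trailing window ending today."""
    start = today - timedelta(days=days)
    return start.isoformat(), today.isoformat()


# A fixed date is used here so the example is reproducible.
start, end = default_date_window(date(2025, 1, 31), days=30)
print(start, end)  # 2025-01-01 2025-01-31
```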

The key design principle is simple: optimize for the landing experience. Make the first paint fast and informative, then let users broaden the scope intentionally as they explore. A fast first paint builds trust, encourages adoption, and sets the performance baseline for every interaction that follows.

For more details, see: https://docs.databricks.com/aws/en/dashboards/filters#-set-default-filter-values

Optimization #3: Use parameters to slice large datasets

Parameters are one of the most effective ways to scale AI/BI dashboards because they shape the query before it runs. By injecting values directly into the SQL, they push predicates down early—so Databricks can prune data sooner and do far less work per query (ideally filtering before joins and aggregations).

Field filters behave differently. If the underlying dataset is small enough to be cached in the browser (≤ 100K rows and ≤ 100MB), field filters and cross-filter interactions can be evaluated client-side with no warehouse round-trip. Once datasets exceed that threshold, field filters are typically pushed to the warehouse by wrapping the dataset query (often via a CTE), which triggers backend SQL execution and may not reduce scan cost as effectively as parameterized predicates applied before joins and aggregation.
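The difference between the two query shapes can be sketched as follows. The table and column names are hypothetical, and the wrapping function only approximates the "wrap the dataset query in a CTE" behavior described above; the SQL the dashboard actually generates may differ. `:region` and `:value` are named parameter markers.

```python
# Hypothetical tables and columns, used only to contrast the two shapes.
PARAMETERIZED_DATASET = """
SELECT o.order_date, SUM(o.amount) AS revenue
FROM sales.orders AS o
JOIN sales.customers AS c ON c.customer_id = o.customer_id
WHERE c.region = :region          -- predicate is part of the dataset query
GROUP BY o.order_date
"""

UNFILTERED_DATASET = """
SELECT o.order_date, c.region, SUM(o.amount) AS revenue
FROM sales.orders AS o
JOIN sales.customers AS c ON c.customer_id = o.customer_id
GROUP BY o.order_date, c.region
"""


def wrap_with_field_filter(dataset_sql: str, column: str) -> str:
    """Approximate a field filter on a large dataset: the dataset query is
    wrapped in a CTE and the filter is applied to its output."""
    return (
        f"WITH dataset AS ({dataset_sql.strip()})\n"
        f"SELECT * FROM dataset WHERE {column} = :value"
    )


field_filter_query = wrap_with_field_filter(UNFILTERED_DATASET, "region")
print(field_filter_query)
```

In a simple case like this one, an optimizer can often push the outer predicate down on its own. The parameterized form does not rely on that: the predicate is already inside the dataset query, so it filters what feeds the aggregation.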

Parameters are especially effective for slicing by date ranges, regions, business units, or other dimensions that significantly reduce data volume. They also make performance more predictable: each tile executes cheaper queries, and concurrency spikes become easier for the warehouse to absorb.

There is a trade-off. Because different parameter values produce different query signatures, cache reuse can be lower when users constantly choose unique values. In practice, this is usually the right compromise: a small, cheap query without a cache hit is far better than a large, expensive query that depends on caching to survive. You can further balance this by using sensible defaults and a limited set of common parameter values so popular paths still benefit from the result cache.

A practical rule of thumb is: keep datasets small enough to fit in the browser cache and use field filters for interactivity (best case: no warehouse round-trips). If you’re not confident the dataset will reliably stay within the browser-cache limits, use parameters to reduce the dataset early—so the warehouse reads less data up front—and then apply filters for deeper exploration. This turns “scan everything and filter later” dashboards into selective, scalable queries that stay fast as data and users grow.
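The rule of thumb can be written down as a small decision helper. This is a sketch under stated assumptions: the function names are made up for illustration, the limits are the ones quoted in this post, and treating 100MB as 100 * 1024 * 1024 bytes is a guess about units.

```python
# Limits quoted in this post; the byte interpretation of "100MB" is an assumption.
MAX_BROWSER_ROWS = 100_000
MAX_BROWSER_BYTES = 100 * 1024 * 1024


def fits_browser_cache(row_count: int, size_bytes: int) -> bool:
    """True when the dataset is within both browser-cache limits."""
    return row_count <= MAX_BROWSER_ROWS and size_bytes <= MAX_BROWSER_BYTES


def filter_strategy(row_count: int, size_bytes: int, stays_small: bool) -> str:
    """Pick a filtering approach for a dataset.

    stays_small is the author's own judgment about whether the dataset will
    reliably remain within the limits as data grows.
    """
    if fits_browser_cache(row_count, size_bytes) and stays_small:
        return "field filters"
    return "parameters first, then field filters"


print(filter_strategy(20_000, 5 * 1024 * 1024, stays_small=True))
print(filter_strategy(90_000, 80 * 1024 * 1024, stays_small=False))
```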

For more details, see: https://medium.com/@andrea0pica/databricks-ai-bi-caching-explained-0c4de1aa946b

Optimization #4: Use the browser cache

Browser caching helps most when interactions are field-filter based on small datasets. Once parameters modify SQL—or datasets exceed the browser threshold—interactions shift back to warehouse execution.

Databricks AI/BI dashboards can cache dataset results directly in the user’s browser, allowing many interactions to be handled entirely client-side without round-trips to the SQL warehouse.

When a dataset is below roughly 100,000 rows, the browser becomes a local execution engine. Cross-filtering, sorting, and simple aggregations can be resolved instantly in memory, producing near-zero-latency interactions and eliminating backend concurrency pressure altogether. This is why well-designed overview pages often feel “instant,” even under heavy usage.

The browser cache is automatically used when:

The dataset is small enough to be safely loaded into the browser.
Interactions are based on field filters or cross-chart selections that can be evaluated client-side.
No parameter change forces a SQL re-execution on the warehouse.

Once datasets grow beyond this threshold—or when parameters modify the underlying SQL—the interaction is pushed back to the warehouse and browser caching no longer applies.

The key design takeaway is intentional dataset sizing. Keep landing-page and overview datasets compact, pre-aggregated, and focused on common KPIs so they qualify for browser caching. This delivers instant “first paint,” reduces query fan-out, and preserves warehouse capacity for deeper investigative pages where backend execution is unavoidable.

For more details, see: https://docs.databricks.com/aws/en/dashboards/caching#dataset-optimizations

Optimization #5: Maximize the usage of the result cache

The cheapest, fastest query is the one you don’t execute.

Databricks SQL result caching turns repeated dashboard interactions into near-instant responses by serving results from cache instead of recomputing them.

How result caching works (what’s cached, where, and for how long)

Databricks SQL checks cached results before executing a query:

Local result cache (all warehouses): per-cluster cache.
Remote result cache (Serverless): workspace-wide and persists across warehouse restarts.

Both caches have a roughly 24-hour TTL and are invalidated when the underlying tables change, so you keep freshness while benefiting from reuse.
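The two rules above (a TTL of about 24 hours and invalidation when the underlying table changes) can be illustrated with a toy model. This is not how Databricks SQL implements its caches: the class, the use of query text as the key, and the `table_version` argument are all assumptions made for the sketch.

```python
from typing import Any, Dict, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # the "roughly 24-hour TTL" mentioned above


class ToyResultCache:
    """Toy model of a result cache with TTL and table-change invalidation."""

    def __init__(self) -> None:
        # query text -> (result, table version at caching time, stored-at time)
        self._entries: Dict[str, Tuple[Any, int, float]] = {}

    def put(self, query: str, table_version: int, result: Any, now: float) -> None:
        self._entries[query] = (result, table_version, now)

    def get(self, query: str, table_version: int, now: float) -> Optional[Any]:
        entry = self._entries.get(query)
        if entry is None:
            return None
        result, cached_version, stored_at = entry
        expired = now - stored_at > TTL_SECONDS
        stale = cached_version != table_version  # underlying table changed
        if expired or stale:
            del self._entries[query]
            return None
        return result


cache = ToyResultCache()
cache.put("SELECT COUNT(*) FROM t", table_version=7, result=42, now=0.0)
hit = cache.get("SELECT COUNT(*) FROM t", table_version=7, now=3600.0)
after_write = cache.get("SELECT COUNT(*) FROM t", table_version=8, now=3700.0)
print(hit, after_write)  # 42 None
```

The second lookup misses even though the entry is only about an hour old, because the table version moved from 7 to 8. That is how freshness is kept in this model.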

Design dashboards so cache hits are likely

Cache hits are not automatic — they’re a design outcome. You get the most value when many users ask the warehouse the same question in the same way.

1) Make tiles deterministic
Avoid non-deterministic functions (e.g., NOW() / current_timestamp()), because they change the query outcome and prevent reuse. Prefer explicit date/time parameters and keep query text stable so identical selections can be served from cache.

2) Reuse datasets and keep the query “shape” consistent
Cache and reuse improve dramatically when tiles share the same dataset logic and a consistent predicate / GROUP BY shape. When multiple visuals can be answered by the same backend query (or the same canonical dataset), you reduce the number of statements dispatched and increase cache hit rates.

3) Be mindful of identity and impersonation
Result cache reuse is most effective when the same query is executed under the same access context. If your setup uses impersonation per viewer, you may reduce cache reuse because results can’t always be shared safely across identities.
Best practice: where it’s acceptable from a security and governance perspective, prefer a shared execution identity for published dashboards (for example, a service principal / shared access context) so repeated views can benefit from cache reuse. If you must use per-user impersonation, compensate by maximizing dataset reuse and focusing on deterministic, parameterized common paths.
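For the first guideline (deterministic tiles), a simple lint over dataset SQL can flag the functions named above before a dashboard is published. This is a minimal sketch: the helper name is hypothetical, the pattern covers only NOW() and current_timestamp, and it does not skip comments or string literals.

```python
import re
from typing import List

# Only the functions named in the text; extend the pattern for your own rules.
NON_DETERMINISTIC = re.compile(
    r"\bnow\s*\(\s*\)|\bcurrent_timestamp\b(?:\s*\(\s*\))?",
    re.IGNORECASE,
)


def find_nondeterministic_calls(sql: str) -> List[str]:
    """Return the non-deterministic time calls found in the SQL text."""
    return [match.group(0) for match in NON_DETERMINISTIC.finditer(sql)]


# Hypothetical dataset queries: one volatile, one driven by a parameter.
volatile = "SELECT * FROM sales.orders WHERE order_ts >= NOW() - INTERVAL 7 DAYS"
stable = "SELECT * FROM sales.orders WHERE order_ts >= :start_ts"

print(find_nondeterministic_calls(volatile))  # ['NOW()']
print(find_nondeterministic_calls(stable))    # []
```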

Prefer Serverless for repeat interactions

Serverless adds the remote result cache, which is shared workspace-wide, survives restarts, and is also used by ODBC/JDBC clients and the SQL Statement API. For dashboards with repeated opens and common filter paths, this often produces the biggest “free” performance win.

Warm caches proactively

For published dashboards, add a schedule. Scheduled runs execute the dataset logic ahead of peak hours and populate caches, improving first paint and smoothing top-of-hour bursts.

Verify hits (and avoid misleading benchmarks)

Use:

Query History: from_result_cache and cache_origin_statement_id
EXPLAIN EXTENDED to understand cache eligibility
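Once Query History rows are exported, the two fields can be summarized in a few lines. The sample rows below are invented, and only the `from_result_cache` and `cache_origin_statement_id` field names come from the list above; the `statement_id` key and the export shape are assumptions.

```python
from collections import Counter
from typing import Dict, List


def cache_hit_rate(rows: List[Dict]) -> float:
    """Fraction of statements served from the result cache."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if row.get("from_result_cache"))
    return hits / len(rows)


# Hypothetical rows shaped like a Query History export.
history = [
    {"statement_id": "a1", "from_result_cache": False, "cache_origin_statement_id": None},
    {"statement_id": "a2", "from_result_cache": True, "cache_origin_statement_id": "a1"},
    {"statement_id": "a3", "from_result_cache": True, "cache_origin_statement_id": "a1"},
    {"statement_id": "b1", "from_result_cache": False, "cache_origin_statement_id": None},
]

# How many later statements reused each original execution.
reuse_by_origin = Counter(
    row["cache_origin_statement_id"] for row in history if row["from_result_cache"]
)

print(cache_hit_rate(history))  # 0.5
print(dict(reuse_by_origin))    # {'a1': 2}
```

A low hit rate on a dashboard with many repeat viewers is a hint to revisit the design guidelines above, such as determinism, dataset reuse, and execution identity.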

Notes:

Local cache won’t insert results larger than ~500 MiB (remote cache has no size restriction).
Remote cache requires clients with Cloud Fetch support (older drivers may miss).

For benchmarking, only disable caching during controlled tests:
SET use_cached_result = false
…and re-enable it for real usage so dashboards benefit from caching in production.

Operational hygiene: avoid cache pollution

Don’t mix unrelated workloads on the same BI warehouse. Dedicated Serverless BI warehouses per domain/workload help the remote cache fill with the repeated queries that matter for those dashboards.

For more details, see: https://www.databricks.com/blog/understanding-caching-databricks-sql-ui-result-and-disk-caches

Outlook: Part 2

This part focused on how dashboard design and interaction patterns shape performance before the warehouse and data are even involved. By reducing fan-out, optimizing first paint, and maximizing cache reuse, you can often unlock large gains without changing the underlying data. In the second part, we complete the picture by going deeper into the platform: how to choose and size the right SQL warehouse, how data modeling and file layout affect scan efficiency, and how precomputation, materialization, and data types keep dashboards fast and stable as usage scales.

The Top 10 Best Practices for AI/BI Dashboards Performance Optimization (Part 1)