Databricks Community

staskh · ‎03-08-2026

Need advice: I'm building a data analysis service on top of Databricks and need to protect it from unauthorized data leaks, specifically file downloads.
As far as I can tell, I need some sort of remote browser isolation (RBI).

Is this the correct technology?
Are there any alternatives?
What are the best, most reasonably priced vendors?

Thank you in advance!

Stas

Ashwin_DSA · ‎03-08-2026

Hi @staskh - Good question. Airlocking Databricks with RBI can easily become more complex than the actual risk you’re trying to manage.

If your end users need insights, don’t give them Databricks access at all. Run workloads via jobs (service principals), keep data in Unity Catalog, and expose only approved outputs via a separate app/BI layer. No workspace access means no download buttons to worry about.

If some users do need interactive access, focus on reducing blast radius, not perfect prevention. You can’t fully stop exfiltration (copy/paste, screenshots, photos). Technology can only raise the bar. Policy and monitoring do the rest.

Use Unity Catalog with tight table/column/row permissions.
Prefer Databricks SQL/dashboards and control export there.
Lock down network and egress (Private Link/VNet, firewalls).

RBI can be a last‑mile control in very high‑security environments. Still, for many deployments, you get more value (and less complexity) from strong permissions, job‑driven access patterns, and network controls rather than trying to airlock the Databricks UI itself.

If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.

Regards,
Ashwin | Delivery Solution Architect @ Databricks
Helping you build and scale the Data Intelligence Platform.
***Opinions are my own***

staskh · ‎03-08-2026

In my situation, I am required to ensure that the end user has access to all data and capabilities within the Databricks environment while simultaneously preventing the download of large amounts of data.

The objective is to safeguard the core datasets, and screenshots and photos are both permissible and even expected.

I am certain that the "backend" airlock will be preserved by employing a virtual private cloud (VPC) with strictly limited egress points. At present, I am confronted with the task of determining an appropriate technology and solution for the frontend airlock.

Sincerely

STAS

Ashwin_DSA · ‎03-08-2026

Hi @staskh,

Thanks for the extra context. I'd flag that your requirement is a bit self-contradictory:

end user has access to all data and capabilities within Databricks
while preventing download of large amounts of data

If a user can see all the data and use all capabilities (including notebooks, SQL, APIs), there's no frontend technology, RBI included, that can guarantee they don't exfiltrate large volumes. They can always script incremental reads, copy/paste, or do exactly what you already accept (screenshots/photos).

Given you're already planning a strong backend airlock with VPC + controlled egress, I'd frame the frontend airlock requirement as "don't try to make exfiltration impossible; make bulk exfiltration harder and more visible."

In practice, that usually means a mix of:

Databricks‑side controls: strict authZ (Unity Catalog), sensible query/result size limits, and audit/monitoring for unusual volumes.
Endpoint/channel controls: DLP, rate‑limiting, possibly VDI/RDS for the user session if you need extra assurance.
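
To make the "audit/monitoring for unusual volumes" part concrete, here is a minimal sketch of the idea, not a Databricks API. The `DownloadEvent` record and its fields are hypothetical; real audit logs have a different schema, so you would map whatever your audit source emits into something like this first.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DownloadEvent:
    """Hypothetical, simplified audit record (assumption, not a real log schema)."""
    user: str
    day: str        # e.g. "2026-03-08"
    bytes_out: int  # size of the downloaded file or query result


def flag_bulk_downloaders(events, daily_limit_bytes):
    """Return {(user, day): total_bytes} for each user-day above the limit."""
    totals = defaultdict(int)
    for event in events:
        totals[(event.user, event.day)] += event.bytes_out
    # Keep only the user-days whose summed volume exceeds the threshold.
    return {key: total for key, total in totals.items() if total > daily_limit_bytes}
```

The point is that many small downloads by one user are summed per day, so a scripted "incremental read" shows up the same way as a single large export.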

RBI can add friction to bulk download options in the browser, but because you already allow screenshots/photos, its incremental benefit is limited. I’d start by tightening permissions, limits, and monitoring, and only add RBI/VDI if your security team still feels the residual risk is too high.

staskh · ‎03-08-2026

I do not require an "NSA-level airlock." Indeed, a malicious actor could write a script that scrolls data across the screen and record it with an external device or, more effectively, render the data as a sequence of QR codes in a video so that error correction is handled for them. We can't prevent that unless we have a physical airlock (no electronic devices allowed, armed security guard, etc.).

I am aiming for a practical level of data leak prevention, such as disabling the single-click download of a file from Catalog Explorer.

While "Jupyter" views allow similar downloads to be disabled, it appears that Catalog Explorer lacks this capability.

Can you please clarify your point on

Endpoint/channel controls: DLP, rate‑limiting, possibly VDI/RDS for the user session if you need extra assurance.

Additionally, I would be extremely grateful for any information regarding the RBI/VDI vendors that Databricks recommends.

Regards,

Stas

Ashwin_DSA · ‎03-08-2026

Hi @staskh,

Got it. You need something that makes bulk leaks harder without fighting screenshots, phones, etc.

On the Catalog Explorer download button: today, if a user has READ VOLUME on a Unity Catalog volume, Catalog Explorer is explicitly designed to let them select files and click Download. There isn't a separate UI switch to hide or disable that control, the way some Jupyter-style file browsers allow.

The practical pattern, if you want to avoid one-click file downloads, is:

Don’t grant business users access to volumes at all (no READ VOLUME), and
Expose data only as tables/views via Unity Catalog, where you can:
- Limit what they see (row/column security, views).
- Control/disable result downloads in downstream tools. For example, the SQL editor has an admin control that can disable downloads entirely for the workspace.
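
As a rough sketch of that pattern, the helper below only builds the SQL text for "no READ VOLUME, SELECT on approved views" so it can be reviewed before anyone runs it. The group, volume, and view names are hypothetical, and I'm assuming views are granted with the same `ON TABLE` form as tables; please check the exact statement forms against the Unity Catalog GRANT/REVOKE reference.

```python
def quote_ident(name: str) -> str:
    """Backtick-quote one identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def quote_path(dotted: str) -> str:
    """Quote each part of a three-level name such as catalog.schema.object."""
    return ".".join(quote_ident(part) for part in dotted.split("."))


def lockdown_statements(group: str, volume: str, views: list[str]) -> list[str]:
    """Build statements that remove volume access and grant read on views only."""
    principal = quote_ident(group)
    statements = [f"REVOKE READ VOLUME ON VOLUME {quote_path(volume)} FROM {principal}"]
    for view in views:
        # Assumption: views are addressed with ON TABLE, like tables.
        statements.append(f"GRANT SELECT ON TABLE {quote_path(view)} TO {principal}")
    return statements


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for stmt in lockdown_statements("analysts", "main.raw.landing", ["main.gold.sales_summary"]):
        print(stmt)
```

Users would still need the usual USE CATALOG / USE SCHEMA privileges on the parents to query those views.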

On the earlier endpoint/channel controls point, I meant:

DLP/CASB: A gateway or endpoint agent that inspects traffic and either blocks or flags patterns like "user just downloaded a 3 GB CSV from Databricks" or "uploaded a large file to a personal SaaS app".
Rate‑limiting/size limits: Use the built‑in limits (e.g., max download sizes in SQL/Genie) plus your own rules (views that aggregate or cap result sizes) so users can’t casually pull full‑fidelity history in one go.
VDI/RDS: Put Databricks behind a virtual desktop (Citrix, VMware Horizon, Azure Virtual Desktop, Amazon WorkSpaces, etc.) and lock down that desktop (no local drives, restricted clipboard/printing). That way, even if the UI offers "Download", the data is landing in a tightly controlled environment, not directly on a personal laptop.
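
To illustrate what "your own rules" for rate limiting could look like at a gateway or proxy, here is a minimal sketch of a per-user rolling byte budget. This is not a built-in Databricks feature; the `ByteBudget` class and its parameters are hypothetical, and the caller supplies the timestamps.

```python
from collections import deque


class ByteBudget:
    """Per-user rolling-window byte budget (illustrative sketch only)."""

    def __init__(self, limit_bytes: int, window_seconds: float):
        self.limit_bytes = limit_bytes
        self.window_seconds = window_seconds
        self._events = {}  # user -> deque of (timestamp, size) pairs

    def allow(self, user: str, size: int, now: float) -> bool:
        """Return True and record the transfer if it fits in the user's budget."""
        queue = self._events.setdefault(user, deque())
        # Drop transfers that have aged out of the rolling window.
        while queue and now - queue[0][0] >= self.window_seconds:
            queue.popleft()
        used = sum(recorded for _, recorded in queue)
        if used + size > self.limit_bytes:
            return False  # would exceed the budget: block or flag for review
        queue.append((now, size))
        return True
```

A denied request does not consume budget, so a user who is blocked on one large export can still run small queries within the same window.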

In terms of RBI/VDI vendors: Databricks doesn't publish an official recommended vendor list for RBI/VDI. In practice, customers usually standardise on whatever their security organisation has already approved (for example, Citrix / VMware Horizon / AVD / Amazon WorkSpaces on the VDI side, or Zscaler / Netskope / Cloudflare-style secure web gateways on the RBI/DLP side). You can sync with your security architects to validate any specific vendor choices, but we don't mandate a product-specific shortlist.

Advise on "airlocking" Databricks service
