3 weeks ago
Need advice: I'm building a data analysis service on top of Databricks and need to protect it from unauthorized data leaks, specifically file downloads.
As far as I can tell, I need some sort of remote browser isolation (RBI).
Is this the correct technology?
Are there any alternatives?
What are the best, most reasonably priced vendors?
Thank you in advance!
Stas
2 weeks ago
Hi @staskh - Good question. Airlocking Databricks with RBI can easily become more complex than the actual risk you’re trying to manage.
If your end users need insights, don’t give them Databricks access at all. Run workloads via jobs (service principals), keep data in Unity Catalog, and expose only approved outputs via a separate app/BI layer. No workspace access means no download buttons to worry about.
If some users do need interactive access, focus on reducing blast radius, not perfect prevention. You can’t fully stop exfiltration (copy/paste, screenshots, photos). Technology can only raise the bar. Policy and monitoring do the rest.
RBI can be a last‑mile control in very high‑security environments. Still, for many deployments, you get more value (and less complexity) from strong permissions, job‑driven access patterns, and network controls rather than trying to airlock the Databricks UI itself.
If this answer resolves your question, could you mark it as “Accept as Solution”? That helps other users quickly find the correct fix.
2 weeks ago
In my situation, I am required to ensure that the end user has access to all data and capabilities within the Databricks environment while simultaneously preventing the download of large amounts of data.
The objective is to safeguard the core datasets, and screenshots and photos are both permissible and even expected.
I am certain that the "backend" airlock will be preserved by employing a virtual private cloud (VPC) with strictly limited egress points. At present, I am confronted with the task of determining an appropriate technology and solution for the frontend airlock.
Sincerely
STAS
2 weeks ago
Hi @staskh,
Thanks for the extra context. I’d flag that your requirement is a bit self‑contradictory:
end user has access to all data and capabilities within Databricks
while preventing download of large amounts of data
If a user can see all the data and use all capabilities (including notebooks, SQL, APIs), there’s no frontend technology, RBI included, that can guarantee they don’t exfiltrate large volumes. They can always script incremental reads, copy/paste, or do exactly what you already accept (screenshots/photos).
Given you’re already planning a strong backend airlock with VPC + controlled egress, I’d frame the frontend airlock requirement not as "make exfiltration impossible" but as "make bulk exfiltration harder and more visible."
In practice, that usually means a mix of:
RBI can add friction to bulk download options in the browser, but because you already allow screenshots/photos, its incremental benefit is limited. I’d start by tightening permissions, limits, and monitoring, and only add RBI/VDI if your security team still feels the residual risk is too high.
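On the monitoring side, one way to make bulk reads "more visible" is a per-user volume threshold over a sliding time window. Below is a minimal Python sketch, assuming read events are already available as records with a user, a timestamp, and a byte count. The `ReadEvent` shape, the one-hour window, and the 500 MB threshold are all hypothetical and not tied to any specific Databricks audit log schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ReadEvent:
    """One data-read event. Hypothetical shape, not a real audit log schema."""
    user: str
    timestamp: datetime
    bytes_read: int


def flag_bulk_readers(events, window=timedelta(hours=1), max_bytes=500 * 1024**2):
    """Return {user: peak_bytes} for users whose total reads inside any
    sliding window of length `window` exceed `max_bytes`."""
    by_user = defaultdict(list)
    for event in events:
        by_user[event.user].append(event)

    flagged = {}
    for user, user_events in by_user.items():
        user_events.sort(key=lambda e: e.timestamp)
        start = 0      # index of the oldest event still inside the window
        running = 0    # bytes read inside the current window
        peak = 0       # largest windowed total seen for this user
        for event in user_events:
            running += event.bytes_read
            # Drop events from the left until the window spans at most `window`.
            while event.timestamp - user_events[start].timestamp > window:
                running -= user_events[start].bytes_read
                start += 1
            peak = max(peak, running)
        if peak > max_bytes:
            flagged[user] = peak
    return flagged
```

A threshold like this does not stop a patient user who stays under the limit. It only makes fast bulk pulls stand out, which matches the "harder and more visible" framing rather than prevention.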
2 weeks ago
I do not require an "NSA-level airlock." Indeed, a malicious actor could develop a script that projects scrolled data onto the screen and records it with an external device, or more effectively, they could create a series of QR code movies to address error correction. We can't prevent it unless we have a physical airlock (no electronic devices allowed, armed security guard, etc.).
I aspire to offer a practical level of data leak prevention, such as disabling the download of a file from Catalog Explorer with a single click, as illustrated below:
While Jupyter views allow similar downloads to be disabled, it appears that Catalog Explorer lacks this capability.
Can you please clarify your point on
Additionally, I would be extremely grateful for any information regarding the RBI/VDI vendors that Databricks has recommended.
Regards,
Stas
2 weeks ago
Hi @staskh,
Got it. You need something that makes bulk leaks harder without fighting screenshots, phones, etc.
On the Catalog Explorer download button: today, if a user has READ VOLUME on a Unity Catalog volume, Catalog Explorer is explicitly designed to let them select files and click Download. There isn’t a separate UI switch to hide or disable that control, the way some Jupyter‑style file browsers allow.
The practical pattern, if you want to avoid one‑click file downloads, is..
On the earlier endpoint/channel controls point, I meant..
In terms of RBI/VDI vendors: Databricks doesn’t publish an official recommended vendor list for RBI/VDI. In practice, customers usually standardise on whatever is already approved in their organisation's security stack (for example, Citrix / VMware Horizon / AVD / Amazon WorkSpaces on the VDI side, or Zscaler / Netskope / Cloudflare‑style secure web gateways on the RBI/DLP side). You can sync with your security architects to validate any specific vendor choices, but we don’t mandate a product‑specific shortlist.
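Coming back to the READ VOLUME point: since the Download action follows from that privilege, one option is to review who holds it and generate revocations for anyone who should not. Below is a minimal Python sketch that only builds the SQL text and does not run anything. The `VolumeGrant` record, the volume name, and the principals are hypothetical. This is an illustration of the privilege point, not necessarily the pattern meant earlier in this reply.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeGrant:
    """A privilege held on a volume. Hypothetical record shape for this sketch."""
    principal: str
    volume: str       # three-level name: catalog.schema.volume
    privilege: str    # for example "READ VOLUME"


def revoke_statements(grants, allowed_principals):
    """Build REVOKE statements for READ VOLUME grants held by principals
    that are not in `allowed_principals`. Nothing is executed here."""
    statements = []
    for grant in grants:
        if grant.privilege != "READ VOLUME":
            continue  # other privileges are out of scope for this sketch
        if grant.principal in allowed_principals:
            continue  # for example the service principal that runs jobs
        statements.append(
            f"REVOKE READ VOLUME ON VOLUME {grant.volume} FROM `{grant.principal}`;"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical grants; in practice these would come from a review of
    # the current permissions on each volume.
    grants = [
        VolumeGrant("analysts", "main.raw.landing", "READ VOLUME"),
        VolumeGrant("etl-service-principal", "main.raw.landing", "READ VOLUME"),
    ]
    for statement in revoke_statements(grants, {"etl-service-principal"}):
        print(statement)
```

Review the generated statements before running them. Revoking READ VOLUME also removes programmatic read access to the files for those principals, so this only fits users who do not need the raw files at all.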