Implementing "break glass" access control in Databricks, similar to Azure Privileged Identity Management (PIM), means building a process in which users operate with minimal default permissions but can temporarily elevate their privileges for critical tasks outside the normal CI/CD deployment flow. The approach below covers best practices specific to Databricks and compares dual accounts with just-in-time (JIT) elevation.
Recommended Approach for Break Glass in Databricks
1. Principle of Least Privilege by Default
- By default, assign users to groups/roles that have only read or limited access in your QA and production workspaces.
- Use SCIM (System for Cross-domain Identity Management) in Databricks, ideally with your IdP (such as Azure AD, now Microsoft Entra ID, or Okta), for group and user provisioning.
2. Break Glass/Elevated Access Workflow
Two Common Patterns:
A. Dual Accounts (Daily Driver + Admin)
- Each user has a normal account for daily tasks (read-only in QA/Prod) and a separate account, e.g., "jsmith-breakglass", which is placed in a group with elevated permissions.
- The break glass account credentials should be tightly controlled (for example, stored in a password manager or protected vault).
- Use of the break glass account must go through a documented approval workflow, with logging and alerts on any usage.
B. Just-in-Time (JIT) Role Elevation (Recommended)
- Users have only one account. Normal operations remain read-only.
- For elevated access, users request temporary membership in a privileged group through a managed process (preferably via your IdP).
- Integration with Azure AD PIM or a similar tool (when your Azure Databricks groups are provisioned from Azure AD) allows for on-demand group assignment, audit logging, and automatic expiration of elevated access.
- Manual process (if not using automated IdP tools): an admin temporarily adds the user to an "Admins" group for a fixed time window, with the manual add/remove procedures documented and monitored.
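For the manual process, the add and remove steps can be scripted so that the removal is never forgotten. The sketch below only builds the SCIM 2.0 PATCH request bodies for adding and removing one group member; it makes no network calls. It assumes the group is managed directly in Databricks (if groups are synced from your IdP, make the change in the IdP instead) and that the bodies are sent to the Databricks SCIM Groups endpoint, e.g. `PATCH /api/2.0/preview/scim/v2/Groups/{group_id}` at workspace level. Check the endpoint path against your deployment. The user ID `"1234567890"` is a hypothetical placeholder.

```python
import json

# SCIM 2.0 PatchOp schema URN, as defined by the SCIM protocol (RFC 7644).
PATCH_SCHEMA = "urn:ietf:params:scim:api:messages:2.0:PatchOp"


def add_member_patch(user_id: str) -> dict:
    """Request body that adds one user to a group (start of the elevation window)."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [
            {"op": "add", "value": {"members": [{"value": user_id}]}}
        ],
    }


def remove_member_patch(user_id: str) -> dict:
    """Request body that removes one user from a group (end of the elevation window)."""
    return {
        "schemas": [PATCH_SCHEMA],
        "Operations": [
            {"op": "remove", "path": f'members[value eq "{user_id}"]'}
        ],
    }


if __name__ == "__main__":
    # Hypothetical user ID; look up the real one via the SCIM Users endpoint.
    print(json.dumps(add_member_patch("1234567890"), indent=2))
    print(json.dumps(remove_member_patch("1234567890"), indent=2))
```

Building both bodies up front makes it easy to schedule the removal at the moment the elevation is granted, rather than relying on someone remembering to revoke it.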
3. Implementing in Databricks
- Use Unity Catalog for fine-grained access control across workspaces, assigning default roles and "break glass" groups as appropriate.
- Set up Databricks groups for:
  - Daily users: assigned read-only permissions in the QA/Prod workspaces.
  - Break glass users: assigned admin or elevated roles, but group membership is gated via approval and logging.
- All elevated actions should be auditable (use Databricks audit logs).
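As an illustration of the two-group setup, the sketch below generates Unity Catalog `GRANT` statements for a read-only group and a break glass group at catalog level. The catalog name `prod` and the group names `prod-readers` and `prod-breakglass` are hypothetical, and the privilege set is only one reasonable choice; many teams will prefer narrower grants than `ALL PRIVILEGES` for the elevated group. The functions only return SQL strings, which would then be run by an admin (for example in a notebook or SQL editor).

```python
def read_only_grants(catalog: str, group: str) -> list[str]:
    """SQL granting browse and read access on a whole catalog.

    Privileges granted on a catalog are inherited by its schemas and tables.
    """
    return [
        f"GRANT USE CATALOG ON CATALOG `{catalog}` TO `{group}`",
        f"GRANT USE SCHEMA ON CATALOG `{catalog}` TO `{group}`",
        f"GRANT SELECT ON CATALOG `{catalog}` TO `{group}`",
    ]


def break_glass_grants(catalog: str, group: str) -> list[str]:
    """SQL granting elevated access; the group itself should normally be empty."""
    return [f"GRANT ALL PRIVILEGES ON CATALOG `{catalog}` TO `{group}`"]


if __name__ == "__main__":
    # Hypothetical catalog and group names.
    for stmt in read_only_grants("prod", "prod-readers"):
        print(stmt + ";")
    for stmt in break_glass_grants("prod", "prod-breakglass"):
        print(stmt + ";")
```

With this layout the privileges on the break glass group are static, and the only thing that changes during an incident is who is a member of that group.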
4. Approval and Monitoring
- Store the approval process in a ticketing system (Jira, ServiceNow, etc.) or in an access management tool (like Azure AD PIM).
- Monitor group membership changes and elevated access usage. Set alerts for any changes to break glass groups.
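The alerting rule on break glass group changes could look roughly like the filter below. It is a minimal sketch that works on audit events already loaded as Python dictionaries. The field names (`action_name`, `request_params`), the action names (`addPrincipalToGroup`, `removePrincipalFromGroup`), and the `targetGroupName` parameter are assumptions based on the Databricks audit log format and should be verified against your own logs before you rely on them. The sample events are made up.

```python
# Assumed audit action names for group membership changes; verify in your logs.
WATCHED_ACTIONS = {"addPrincipalToGroup", "removePrincipalFromGroup"}


def break_glass_changes(events: list[dict], group_name: str) -> list[dict]:
    """Return audit events that change membership of the given group."""
    hits = []
    for event in events:
        if event.get("action_name") not in WATCHED_ACTIONS:
            continue
        params = event.get("request_params") or {}
        # Assumed parameter name carrying the affected group.
        if params.get("targetGroupName") == group_name:
            hits.append(event)
    return hits


if __name__ == "__main__":
    # Fabricated sample events, for illustration only.
    sample = [
        {"action_name": "addPrincipalToGroup",
         "request_params": {"targetGroupName": "prod-breakglass",
                            "targetUserName": "jsmith@example.com"}},
        {"action_name": "addPrincipalToGroup",
         "request_params": {"targetGroupName": "prod-readers",
                            "targetUserName": "adoe@example.com"}},
        {"action_name": "login", "request_params": {}},
    ]
    for hit in break_glass_changes(sample, "prod-breakglass"):
        print("ALERT:", hit["action_name"], hit["request_params"]["targetUserName"])
```

In practice the same condition would more likely live in a scheduled query or SIEM rule; the Python version is only meant to make the matching logic explicit.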
5. Emergency (Break Glass) Procedures
- In case of emergency, ensure steps are documented:
  - Who can escalate access.
  - Who approves.
  - How logs are collected and reviewed after the fact.
  - How and when elevated rights are revoked.
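The documented steps above map naturally onto a small record per elevation: who asked, who approved, which ticket, and when the access ends. The sketch below is one hypothetical way to model that in Python. The class and field names are illustrative, and the rule that a requester cannot approve their own elevation is an added assumption (a common separation-of-duties control), not something Databricks enforces.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ElevationGrant:
    requester: str       # who escalated access
    approver: str        # who approved it
    ticket: str          # reference to the approval record (e.g. a Jira key)
    granted_at: datetime
    duration: timedelta  # how long the elevation may last

    def __post_init__(self) -> None:
        # Assumed policy: no self-approval.
        if self.requester == self.approver:
            raise ValueError("requester cannot approve their own elevation")

    @property
    def expires_at(self) -> datetime:
        return self.granted_at + self.duration

    def is_expired(self, now: datetime) -> bool:
        return now >= self.expires_at


def grants_to_revoke(grants: list[ElevationGrant], now: datetime) -> list[ElevationGrant]:
    """Grants whose time window has ended and whose access should be removed."""
    return [g for g in grants if g.is_expired(now)]


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    grant = ElevationGrant("jsmith", "adoe", "OPS-1", start, timedelta(hours=2))
    print(grant.expires_at.isoformat())
    print(grants_to_revoke([grant], start + timedelta(hours=3)))
```

A scheduled job that calls `grants_to_revoke` and then removes the listed users from the break glass group gives the manual process an automatic expiry similar to what PIM provides.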
Dual Accounts vs. JIT Elevation
| Aspect | Dual Accounts | Just-in-Time Elevation (PIM-style) |
| --- | --- | --- |
| Usability | Clunky, need to track multiple creds | Seamless, single identity |
| Security | Good, but shared creds risk (if mishandled) | Excellent, no extra creds, better audit |
| Auditability | Auditable, but easier to miss linkages | Highly auditable, centralized |
| Automation | Awkward, hard to automate | Easy with IdP integration |
| Recommended for Databricks? | Only if JIT tools unavailable | Best practice if possible |
Summary and Recommendations
- Best approach: Use JIT group membership elevation via your IdP, paired with Databricks access controls and audit logs.
- Azure Databricks integrates well with Azure AD PIM; other IdPs (Okta, etc.) can also be used for temporary group assignment.
- Fallback approach: Dual accounts, with admin accounts restricted and usage thoroughly audited, should only be used if JIT methods are unavailable.
- Always require documented approval for break glass access, and audit all changes post-event.
- Enforce read-only roles in QA and prod for all users outside the break glass group.
This approach gives you strong operational security, full auditability, and minimal disruption to your CI/CD process, while still allowing emergency or manual interventions when absolutely necessary.