
Guidance on Managing Databricks Apps Whitelisting

Chinu
New Contributor III

Hi,

Our workspace users currently have permission to create Databricks apps, and we've observed a rise in associated costs. To address this, we're developing an administrative notebook to:

  1. Check whether an app is whitelisted/approved by the admins.

  2. Automatically remove apps that aren't approved.

We're unsure how to implement a whitelisting mechanism for apps. Could you share any recommendations or best practices for managing app whitelisting in Databricks?

Thank you for your help!

Accepted Solutions

Louis_Frolio
Databricks Employee

Here are some helpful tips that you might find useful:

 

Summary of Best Practices and Recommendations for App Whitelisting and Automated App Removal in Databricks

1. Overview and Whitelisting Strategy

To control costs and maintain governance over Databricks app usage in your workspace, the recommended approach is to implement a clear app whitelisting mechanism and automate removal of unapproved apps. The following best practices and guidance summarize the official recommendations and field experience for Databricks Apps.
What is "Whitelisting" in this Context? A "whitelist" is a defined set of Databricks apps that have been reviewed and approved by administrators for use in the workspace. Only apps on this list should be allowed to exist; all others should be flagged and optionally removed.

2. Storage and Management of the Whitelist

There are several practical options for maintaining your workspace's app whitelist:
  • Workspace Table or Delta Table: Store a table with app names, owners, approval status, and other metadata. This can be referenced by an administrative notebook for checks and reporting.
  • Configuration File: Use a workspace file (YAML, JSON, etc.) with the approved app names.
  • Secrets: If you need to store sensitive information (like app IDs tied to privileged resources), you could store whitelist details in Databricks Secrets.
  • Unity Catalog Table: For larger environments, a Unity Catalog managed table shared with admins for central control is ideal.
Choose the storage method that best fits your operational security requirements and maintainability preferences.
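
As a rough illustration of the configuration-file option, here is a minimal sketch that parses a JSON whitelist into a lookup keyed by app name. The file layout (`approved_apps`, `owner`, `approved_by`) and the app names are assumptions made up for this example, not a Databricks-defined format.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ApprovedApp:
    """One whitelist entry. The field names are hypothetical."""
    name: str
    owner: str
    approved_by: str


def parse_whitelist(raw_json: str) -> dict[str, ApprovedApp]:
    """Parse a JSON whitelist document into a dict keyed by app name."""
    entries = json.loads(raw_json)["approved_apps"]
    return {
        e["name"]: ApprovedApp(e["name"], e["owner"], e["approved_by"])
        for e in entries
    }


# Hypothetical whitelist content; in practice this would be read from a
# workspace file or built from the rows of a Delta / Unity Catalog table.
EXAMPLE_WHITELIST = """
{
  "approved_apps": [
    {"name": "sales-dashboard", "owner": "alice@example.com", "approved_by": "admin-team"},
    {"name": "ops-monitor", "owner": "bob@example.com", "approved_by": "admin-team"}
  ]
}
"""

whitelist = parse_whitelist(EXAMPLE_WHITELIST)
print(sorted(whitelist))
```

If the whitelist lives in a table instead, the same dictionary could be built in a notebook from the rows returned by `spark.table(...)` on whatever table name you choose.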

3. Automated Enforcement Workflow

A robust administrative notebook should:
  • Enumerate all current apps in the workspace: Use the official Databricks APIs or SDK to list all app resources.
  • Compare with the whitelist: Cross-reference the list of existing apps with your approved whitelist.
  • Flag or Remove unapproved apps:
    • Unapproved apps should be reported or, if desired, automatically removed.
    • Build in logging/audit capabilities and a dry-run mode so that changes can be validated without disrupting users.
Here is a typical control flow:
 
```python
# Pseudocode outline (Python-based, can be adapted to Scala/Spark)
# load_whitelist and databricks_admin_api are placeholders, not real library names
approved_apps = load_whitelist()  # Load whitelist from table, secret, or file
current_apps = databricks_admin_api.list_apps()  # List all workspace apps

for app in current_apps:
    if app.name not in approved_apps:
        # Optional: log, notify, or tag before removal
        databricks_admin_api.remove_app(app.id)  # Remove unapproved app
```

Be sure to handle exceptions, permissions, and edge cases (apps in a transient state, or apps deployed by critical users) as needed.
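
To make the flow above a bit more concrete, here is a hedged sketch with a dry-run mode, logging, and per-app error handling. The function names are made up for this example. The enforcement logic takes the deletion step as a callable so it can be tested without a workspace; `run_in_workspace` shows how it might be wired to the Databricks SDK for Python, assuming `WorkspaceClient().apps.list()` and `apps.delete(name=...)` are available in your installed SDK version (worth verifying before relying on it).

```python
import logging
from typing import Callable, Iterable

logger = logging.getLogger("app_whitelist")


def enforce_whitelist(
    app_names: Iterable[str],
    approved: Iterable[str],
    delete_app: Callable[[str], object],
    dry_run: bool = True,
) -> tuple[list[str], list[str]]:
    """Return (unapproved, removed). Nothing is deleted when dry_run is True."""
    unapproved = sorted(set(app_names) - set(approved))
    removed: list[str] = []
    for name in unapproved:
        if dry_run:
            logger.warning("DRY RUN: would remove unapproved app %s", name)
            continue
        try:
            delete_app(name)
            removed.append(name)
            logger.warning("Removed unapproved app %s", name)
        except Exception:
            # Keep going so one failure does not block the remaining apps.
            logger.exception("Failed to remove app %s", name)
    return unapproved, removed


def run_in_workspace(approved: Iterable[str], dry_run: bool = True):
    """Assumed wiring to the Databricks SDK; needs databricks-sdk and credentials."""
    from databricks.sdk import WorkspaceClient  # imported lazily on purpose

    w = WorkspaceClient()
    names = [app.name for app in w.apps.list()]
    return enforce_whitelist(
        names, approved, lambda n: w.apps.delete(name=n), dry_run
    )
```

Starting with `dry_run=True` and reviewing the logged names for a few runs before switching to real deletion is probably the safer rollout.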

4. Permissions and Governance Recommendations

To enforce governance and prevent unwanted apps from being created:
  • Restrict "CAN MANAGE" app permission: Only grant this to trusted administrators or peer-reviewed senior developers.
  • Restrict "CAN USE" permission to only those groups or users who need access to a given app.
  • For OBO (on-behalf-of) apps, only enable this feature in trusted environments with peer-reviewed code and restrict additional scopes to the minimum needed.
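
One way to check the "CAN MANAGE" recommendation periodically is sketched below. It works on plain data: a mapping from app name to (principal, permission level) pairs. How you fetch those grants from the workspace is left out, and the data shape, the `CAN_MANAGE` string, and all names here are assumptions for illustration.

```python
def find_unexpected_managers(
    grants_by_app: dict[str, list[tuple[str, str]]],
    trusted_admins: set[str],
) -> dict[str, list[str]]:
    """Return, per app, principals holding CAN_MANAGE who are not trusted admins."""
    findings: dict[str, list[str]] = {}
    for app_name, grants in grants_by_app.items():
        extra = sorted(
            principal
            for principal, level in grants
            if level == "CAN_MANAGE" and principal not in trusted_admins
        )
        if extra:
            findings[app_name] = extra
    return findings


# Hypothetical grants, as they might look after being collected per app.
grants = {
    "sales-dashboard": [("admins", "CAN_MANAGE"), ("analysts", "CAN_USE")],
    "ops-monitor": [("admins", "CAN_MANAGE"), ("dave@example.com", "CAN_MANAGE")],
}
findings = find_unexpected_managers(grants, trusted_admins={"admins"})
print(findings)
```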

5. Security, Auditing, and Compliance

Make use of Databricks audit logs:
  • Track permission changes on apps and who approved or made changes to the whitelist.
  • Set up workflows to log all admin actions, app creation, sharing, and deletion for compliance audits.

6. Environments and Promotion

  • Maintain separate whitelists for dev, staging, and production environments.
  • Use CI/CD and Databricks Asset Bundles (DABs) to promote only approved apps between environments.
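
A small sketch of how per-environment whitelists might gate promotion. The environment names, app names, and the `can_promote` helper are hypothetical; in a CI/CD pipeline a check like this could run before the deploy step.

```python
# Hypothetical per-environment whitelists; in practice these would be loaded
# from wherever the whitelist is stored (table, file, and so on).
WHITELISTS: dict[str, set[str]] = {
    "dev": {"sales-dashboard", "ops-monitor", "prototype-chat"},
    "staging": {"sales-dashboard", "ops-monitor"},
    "prod": {"sales-dashboard"},
}


def approved_for(env: str) -> set[str]:
    """Return the whitelist for an environment, failing loudly on unknown names."""
    try:
        return WHITELISTS[env]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None


def can_promote(app_name: str, target_env: str) -> bool:
    """An app may be promoted only if the target environment has approved it."""
    return app_name in approved_for(target_env)


print(can_promote("ops-monitor", "staging"), can_promote("ops-monitor", "prod"))
```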

7. Additional Best Practices

  • Regularly review the whitelist and app logs to ensure consistency and compliance.
  • Periodically audit installed apps to review cost and usage patterns.
  • Isolate apps by workspaces/environments where appropriate to reduce risk surface.
  • Document and peer-review all changes to app permissions and whitelist entries.
  • Maintain least privilege both for the OAuth scopes requested by apps and for the Databricks resource permissions granted to app service principals.

Table: Implementation Checklist

| Action | Recommended Practice |
| --- | --- |
| Whitelist Storage | Workspace table, UC table, config file, or secret |
| Enumerate Apps | Use Databricks REST API or SDK |
| Compare and Log Discrepancies | Cross-reference with whitelist, then log or send notifications |
| Remove Unapproved Apps | Automated via admin notebook or DABs |
| Governance Controls | Restrict CAN MANAGE and CAN USE rights |
| Audit and Review | Use Databricks audit logs and periodic reviews |
| Promotion Across Environments | Deploy approved apps via CI/CD and DABs |
| Documentation and Peer Review | Require for changes to whitelist or app access |
| Ongoing Security Assessment | Utilize Databricks security center best practices |

Example Policy Logic

  • Allow only whitelisted apps: Only apps listed in your whitelist are allowed to run or be present in the workspace.
  • Alert or auto-remove all others: For any app detected that's not on the whitelist, admins are alerted, optionally with automatic removal.
  • Restrict app modifications: Only those with "CAN MANAGE" access may modify or approve changes to the whitelist.

 
