Data Governance

Audit Access Rights

Databricks1126
New Contributor

We have a large Databricks instance, and we are performing a technical audit of Databricks to identify (1) the full list of users, service principals, and groups; (2) the full list of objects (e.g. catalogs, schemas, jobs, notebooks, etc.); and (3) the access levels of the users, service principals, and groups to those objects.

Here are the specific asks:

  • What is the full universe of 'objects' on Databricks that users can create and use to transform data? (e.g., catalogs, schemas, jobs, notebooks, etc.)
  • Are there hierarchical access relationships between these objects? For example, does access to Object A give you access to Object B and Object C?
  • How can we pull this information programmatically from Databricks?
1 REPLY

WiliamRosa
Contributor

Hi @Databricks1126,

I understand that you're looking to capture permissions across a wide variety of Databricks objects. These can generally be grouped into three main categories:

- Data objects (governed by Unity Catalog): catalogs, schemas, tables, views, volumes, functions, models.
- Workspace objects (compute / code / workflow): jobs, notebooks, repos, pipelines, SQL warehouses, dashboards.
- Identity / configuration objects: users, service principals, groups, secrets, clusters, instance pools.
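
For the data objects specifically, Unity Catalog also exposes grants through its system information schema, so you can pull them with plain SQL before touching the REST API. A minimal sketch, assuming Unity Catalog is enabled and your principal can read system.information_schema (privilege views such as table_privileges, schema_privileges, and catalog_privileges live there):

# Pull table-level grants from Unity Catalog's information schema.
# Assumes Unity Catalog is enabled and the running principal has access
# to system.information_schema; analogous views cover other securables
# (catalog_privileges, schema_privileges, volume_privileges, ...).
table_grants = spark.sql("""
    SELECT grantee, table_catalog, table_schema, table_name, privilege_type
    FROM system.information_schema.table_privileges
""")
display(table_grants)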

Because this is quite a broad universe, a good first step for such an audit is to use the Databricks REST API. The official reference is here:
https://docs.databricks.com/api/workspace/introduction

For example, you can start by retrieving the full list of workspace users via the SCIM API, and then for each user (by ID or email) check their associated permissions:

import requests, json

host = spark.conf.get("spark.databricks.workspaceUrl")
token = dbutils.secrets.get("my-scope", "DATABRICKS_TOKEN")

# List all users (note: SCIM list responses are paginated; pass
# startIndex/count query params on workspaces with many users)
url = f"https://{host}/api/2.0/preview/scim/v2/Users"
resp = requests.get(url, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
data = resp.json()

for user in data.get("Resources", []):
    print(user["id"], user["userName"], user.get("displayName"))

# Look up a specific user by email; passing the SCIM filter through
# `params` lets requests handle the URL encoding
user_email = "user@test.com"
url = f"https://{host}/api/2.0/preview/scim/v2/Users"
params = {"filter": f'userName eq "{user_email}"'}
resp = requests.get(url, params=params, headers={"Authorization": f"Bearer {token}"})
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2))

The response is a SCIM User document (e.g., id, userName, displayName, groups, entitlements, ...).
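
Service principals and groups can be listed the same way via /api/2.0/preview/scim/v2/ServicePrincipals and /api/2.0/preview/scim/v2/Groups. For workspace objects (jobs, clusters, warehouses, notebooks, ...), the Permissions API exposes each object's ACL. Here is a rough sketch for jobs, reusing host and token from above; treat it as a starting point rather than a complete audit script:

# Enumerate jobs, then fetch each job's access-control list.
# (jobs/list is paginated; follow has_more/page_token on large workspaces)
jobs_resp = requests.get(
    f"https://{host}/api/2.1/jobs/list",
    headers={"Authorization": f"Bearer {token}"},
)
jobs_resp.raise_for_status()

for job in jobs_resp.json().get("jobs", []):
    job_id = job["job_id"]
    perm_resp = requests.get(
        f"https://{host}/api/2.0/permissions/jobs/{job_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    perm_resp.raise_for_status()
    # Each ACL entry names a user, group, or service principal plus its levels.
    for acl in perm_resp.json().get("access_control_list", []):
        principal = (
            acl.get("user_name")
            or acl.get("group_name")
            or acl.get("service_principal_name")
        )
        levels = [p["permission_level"] for p in acl.get("all_permissions", [])]
        print(job_id, principal, levels)

The same GET /api/2.0/permissions/{object_type}/{object_id} pattern covers clusters, instance pools, pipelines, SQL warehouses, and workspace objects such as notebooks and directories.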
Hope that helps!

Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa
