Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Best Approach for Per-User Authentication with Databricks in an Agent-Based System

Vamsi3757
Visitor

Hello everyone,

I have built an ADK-based agent that connects to Databricks and can retrieve various information. However, I’m trying to design a secure way to authenticate each user individually when they interact with the agent.

So far, I see two possible approaches:

  1. Using Personal Access Tokens (PATs)

    • I would prefer to avoid this approach because it requires passing PATs around, and I don’t want to expose or manage user tokens within the agent.
  2. Using OAuth with a Service Principal

    • In this case, we need to use a client_id and client_secret.
    • However, this also involves passing sensitive credentials to the agent, which I would like to avoid for security reasons.

Given these constraints, I’m looking for guidance on:

Is there a recommended approach to securely authenticate each user individually in an agent-based architecture without exposing PATs or client secrets?

For example:

  • Are there patterns involving SSO, external identity providers, or backend-mediated authentication?
  • Has anyone implemented a per-user authentication model in a similar setup?

I’d really appreciate any suggestions or best practices from the community.

Thanks in advance!

2 REPLIES

sameer_yasser
New Contributor II

Great question. The core principle is that credentials should never travel through the agent itself.

Instead of using PATs or embedding a client_secret in your agent, use your Identity Provider (Entra ID / Okta) with the OAuth on-behalf-of (OBO) flow:

  1. The user authenticates via your IdP → receives a short-lived JWT.
  2. Your backend exchanges that token for a Databricks-scoped token using OBO.
  3. Databricks sees the actual user identity, not a shared service principal.

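```python
# Backend only: never inside agent logic.
# Assumes msal_app is an msal.ConfidentialClientApplication for your Entra ID app
# and user_aad_token is the access token the user presented to your backend.
obo_token = msal_app.acquire_token_on_behalf_of(
    user_assertion=user_aad_token,
    scopes=["2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default"],  # Azure Databricks resource ID
)
if "access_token" not in obo_token:
    # MSAL returns an error dict instead of raising on failure.
    raise RuntimeError(obo_token.get("error_description", "OBO exchange failed"))
databricks_token = obo_token["access_token"]
```

Your agent receives only a short-lived, scoped token injected at request time, so no secrets are stored anywhere in the agent.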
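
One way to picture "injected at request time" is a request-scoped holder that the backend fills before invoking the agent and clears afterwards. The sketch below uses only the Python standard library; the names `_request_token`, `user_token`, and `databricks_headers` are hypothetical and not part of ADK, MSAL, or the Databricks SDK.

```python
import contextvars
from contextlib import contextmanager

# Hypothetical holder for the current request's per-user token.
# It has no default, so reading it outside a request fails loudly.
_request_token = contextvars.ContextVar("databricks_user_token")


@contextmanager
def user_token(token: str):
    """Expose a per-user token only for the duration of one request."""
    handle = _request_token.set(token)
    try:
        yield
    finally:
        # Restore the previous state so the token does not outlive the request.
        _request_token.reset(handle)


def databricks_headers() -> dict:
    """Build auth headers for a tool call from the request-scoped token."""
    try:
        token = _request_token.get()
    except LookupError:
        raise RuntimeError("no user token bound to this request") from None
    return {"Authorization": f"Bearer {token}"}
```

The backend would wrap each agent invocation in `with user_token(databricks_token): ...`, and the agent's tools would call `databricks_headers()` when talking to Databricks, so the token is never written to agent state or configuration.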

ShamenParis
New Contributor II

Hello @Vamsi3757, this is an excellent architecture question. You are right to be wary of PATs (because of their lifecycle and security risks) and of a shared Service Principal (which masks individual user identities and makes Unity Catalog audit logging difficult).

To securely authenticate each user individually, the recommended best practice in Databricks is to use OAuth User-to-Machine (U2M) Authentication.

Depending on where your agent runs, you will use one of two standard OAuth patterns. You configure both in the Databricks Account Console under Settings -> App connections -> Add connection.


The Public Client Pattern (For Local Agents / CLIs)

If your agent runs locally on the user's machine (like a Python script or desktop app), you should use the OAuth 2.0 Authorization Code Flow with PKCE. This opens a browser window for the user to log in via standard SSO, then securely returns a token to the agent.

  • How to configure it:

    • Name: Give it a clear name (e.g., "Local AI Agent").

    • Redirect URLs: Set this to your local callback port (e.g., http://localhost:8080/callback).

    • Access scopes: Select SQL if your agent only queries data via Databricks SQL. Select All APIs if your agent needs to interact with workspace assets (Notebooks, Jobs, Clusters, etc.).

    • Generate a client secret: Leave this UNCHECKED. This explicitly tells Databricks it is a Public Client using PKCE, meaning your agent's code never needs to store or pass a sensitive static secret.

    • Access/Refresh Token TTL: The defaults (60 mins / 10080 mins) are usually sufficient.
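
As a rough illustration of what the PKCE part of this flow involves, the sketch below generates a code verifier, derives the S256 challenge as defined in RFC 7636, and builds an authorization URL. The `/oidc/v1/authorize` path and the `all-apis offline_access` scope string are assumptions based on Databricks' documented OAuth endpoints, so check them against your workspace; the function names are hypothetical.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode


def make_code_verifier() -> str:
    # 32 random bytes encode to a 43-character URL-safe string
    # (RFC 7636 allows 43 to 128 characters).
    return secrets.token_urlsafe(32)


def code_challenge_s256(verifier: str) -> str:
    # S256 challenge: base64url(SHA-256(verifier)) without '=' padding.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def authorize_url(host: str, client_id: str, redirect_uri: str,
                  verifier: str, state: str,
                  scope: str = "all-apis offline_access") -> str:
    # Assumed endpoint path; the verifier itself stays on the client
    # and is only sent later, with the token request.
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "response_type": "code",
        "state": state,
        "code_challenge": code_challenge_s256(verifier),
        "code_challenge_method": "S256",
        "scope": scope,
    }
    return f"https://{host}/oidc/v1/authorize?{urlencode(params)}"
```

Because only the hash of the verifier appears in the browser URL, an intercepted authorization code is useless without the verifier held by the local agent, which is why no client secret is needed in this pattern.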

The Backend-Mediated Pattern (For Web App Agents)

If your agent is hosted as a web application (e.g., a React frontend with a Python backend), you should use a Confidential Client. The user logs into the frontend, and your secure backend brokers the Databricks OAuth token.

  • How to configure it:

    • Redirect URLs: Set this to your backend's OAuth callback endpoint (e.g., https://yourapp.com/api/callback).

    • Generate a client secret: CHECK this box. Because your backend is a secure server, it can safely hold the generated client ID and client secret to facilitate the token exchange. The frontend never sees these secrets.
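
A minimal sketch of what the backend callback might do with the returned authorization code is shown below. It only builds the token request and does not send it. The `/oidc/v1/token` path and the use of HTTP Basic client authentication (RFC 6749, section 2.3.1) are assumptions to verify against the Databricks OAuth documentation, and `build_token_request` is a hypothetical helper name.

```python
import base64
import hmac
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(host: str, client_id: str, client_secret: str,
                        code: str, redirect_uri: str,
                        returned_state: str, expected_state: str) -> Request:
    # Reject callbacks whose state differs from the one issued at login (CSRF guard).
    if not hmac.compare_digest(returned_state, expected_state):
        raise ValueError("OAuth state mismatch")

    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
    }).encode("ascii")

    # The client secret is used here, on the server, and never reaches the frontend.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return Request(
        f"https://{host}/oidc/v1/token",  # assumed endpoint path
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

The backend would send this request (for example with `urllib.request.urlopen`), keep the returned tokens in the user's server-side session, and pass only the short-lived access token to the agent for each call.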

By using Databricks native OAuth, your agent acts on behalf of the logged-in user. Unity Catalog row-level and column-level security policies are evaluated against that user's identity, and your Databricks Audit Logs will reflect the actual human user.

If you are using Python, the Databricks SDK can handle the token refresh cycles and OAuth callbacks for both of these setups, so you rarely have to manage the raw tokens yourself. Hope this helps!