Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Documentation on all ways to access agent serving endpoint from outside databricks

Rajat-TVSM
New Contributor III

I'm struggling to find clear documentation on this. I need to know all the ways to access the endpoint (production best practices), along with the API methods. As far as I know, using a PAT is not a production best practice.

4 REPLIES

Gecofer
Contributor II

Hi @Rajat-TVSM 

You're absolutely right that Personal Access Tokens (PATs) are not considered a production best practice. For accessing Agent / Model Serving endpoints from outside Databricks, the recommended and supported approach for production is:

Service Principal authentication (OAuth-based)

  • Create a Service Principal
  • Grant it permissions on the serving endpoint
  • Authenticate using short-lived OAuth tokens
  • Call the Databricks Serving REST API from external systems

This approach provides proper security, token rotation, and governance, and is suitable for production workloads, CI/CD pipelines, and external applications.

PATs should be limited to development or proof-of-concept use cases only.

Optionally, for more enterprise-grade setups, an AI Gateway can be used in front of the serving endpoint to centralize authentication, rate limiting, and observability.

Hope this helps clarify the recommended production setup.

 

Gema.

Rajat-TVSM
New Contributor III

Hi Gecofer/Gema,

I was looking for documentation that includes actual code examples for this, but I haven't been able to find any.

Gecofer
Contributor II

Hi @Rajat-TVSM 

These official Databricks links should help, as they cover the production-recommended way (Service Principal) and the Serving Endpoint API with examples:

Service Principal authentication
https://docs.databricks.com/en/dev-tools/auth/service-principals.html

Serving Endpoints REST API (Agent / Model Serving)
https://docs.databricks.com/api/workspace/servingendpoints

Hope this documentation helps.

nayan_wylde
Esteemed Contributor

To access an Agent serving endpoint without a Personal Access Token (PAT), the recommended approach is OAuth 2.0 Machine-to-Machine (M2M) authentication. This is the standard pattern for production applications that run without a human user.

1. OAuth M2M Authentication Workflow

Instead of a long-lived PAT, you use a Service Principal (an identity for your app) and a Client Secret to request short-lived (1-hour) access tokens.

Setup Steps

  1. Create a Service Principal: In your Databricks workspace, go to Settings > User Management > Service Principals.
  2. Generate a Secret: Select the service principal, go to the Secrets tab, and click Generate secret. Save the Client ID and Client Secret.
  3. Assign Permissions: Go to the Serving tab, select your agent endpoint, and under Permissions, grant your Service Principal Can Query permissions.

2. Access via Python (Databricks SDK)

The Databricks SDK handles the token lifecycle (fetching and refreshing) automatically if you provide the credentials.

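import os
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ChatMessage, ChatMessageRole

# Credentials should be stored in environment variables for security.
# os.environ[...] raises KeyError if a variable is missing, so a
# misconfigured deployment fails immediately instead of sending None.
w = WorkspaceClient(
    host="https://<workspace-instance-name>.cloud.databricks.com",
    client_id=os.environ["DATABRICKS_CLIENT_ID"],
    client_secret=os.environ["DATABRICKS_CLIENT_SECRET"],
)

# Querying the agent endpoint.
# The SDK expects ChatMessage objects here rather than plain dicts.
response = w.serving_endpoints.query(
    name="my-agent-endpoint",
    messages=[
        ChatMessage(role=ChatMessageRole.USER, content="How do I use this agent?")
    ],
)

# Assumes the endpoint returns a chat-completion style response.
print(response.choices[0].message.content)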

3. Access via REST API

If you aren't using the Python SDK, you must manually fetch the token first.

Step 1: Fetch the OAuth Token

# Token URL format: https://<workspace-instance>/oidc/v1/token
curl -X POST "https://<workspace-instance>.cloud.databricks.com/oidc/v1/token" \
     -u "$CLIENT_ID:$CLIENT_SECRET" \
     -d "grant_type=client_credentials&scope=all-apis"
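
For callers that are not using curl, the same token request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function and class names (build_token_request, fetch_token, TokenCache) are hypothetical, and it assumes the token endpoint returns the usual OAuth 2.0 fields access_token and expires_in. The cache reuses a token until shortly before the 1-hour expiry mentioned above, so the application does not request a new token on every call.

```python
import base64
import json
import time
import urllib.parse
import urllib.request


def build_token_request(host, client_id, client_secret):
    """Build the same POST the curl command sends: Basic auth plus a form body."""
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    return urllib.request.Request(
        f"{host}/oidc/v1/token",
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def fetch_token(host, client_id, client_secret):
    """Send the request and return the parsed JSON token response."""
    req = build_token_request(host, client_id, client_secret)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


class TokenCache:
    """Hypothetical helper: reuse a token until shortly before it expires."""

    def __init__(self, fetch, skew_seconds=60, clock=time.monotonic):
        self._fetch = fetch          # zero-argument callable returning the token JSON
        self._skew = skew_seconds    # refresh this many seconds before expiry
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._skew:
            payload = self._fetch()
            self._token = payload["access_token"]
            # Assumption: fall back to one hour if expires_in is absent.
            self._expires_at = now + float(payload.get("expires_in", 3600))
        return self._token
```

A caller would construct the cache once, for example TokenCache(lambda: fetch_token(host, client_id, client_secret)), and call get() before each request to the serving endpoint.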

Step 2: Query the Agent Endpoint

curl -X POST "https://<workspace-instance>.cloud.databricks.com/serving-endpoints/my-agent-endpoint/invocations" \
     -H "Authorization: Bearer <access_token_from_step_1>" \
     -H "Content-Type: application/json" \
     -d '{"messages": [{"role": "user", "content": "Hello agent!"}]}'
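
The query in Step 2 can be sketched the same way in Python. Again, this is only an illustration under stated assumptions: build_invocation_request and query_endpoint are hypothetical names, the access token is passed in by the caller, and the request body mirrors the messages payload from the curl example. The shape of the response depends on how the agent was deployed, so the sketch returns the parsed JSON without assuming any fields.

```python
import json
import urllib.request


def build_invocation_request(host, endpoint_name, access_token, user_message):
    """Build the POST to /serving-endpoints/<name>/invocations with a Bearer token."""
    payload = {"messages": [{"role": "user", "content": user_message}]}
    return urllib.request.Request(
        f"{host}/serving-endpoints/{endpoint_name}/invocations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def query_endpoint(host, endpoint_name, access_token, user_message, timeout=60):
    """Send the request and return the parsed JSON body as a dict."""
    req = build_invocation_request(host, endpoint_name, access_token, user_message)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

Because urlopen raises urllib.error.HTTPError on error statuses, a caller can catch it to tell an authentication failure (for instance, an expired token) apart from a permissions problem on the endpoint.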

 

Documentation Links