
Agent outside Databricks communicating with a Databricks MCP server

maikel
New Contributor II

Hello Community!

I have the following use case in my project:

User -> AI agent -> MCP Server -> Databricks data from unity catalog.

- The AI agent is not created in Databricks.
- The MCP server is created in Databricks and should expose tools to get data from a given Unity Catalog.

I can see at https://docs.databricks.com/aws/en/generative-ai/mcp/custom-mcp that it should be possible to host an MCP server in a Databricks app. However, my question is: would it be possible to connect it to an agent outside Databricks? Additionally, should I implement methods in my custom MCP server to work with Databricks data via REST, or can this be simplified given that the MCP server runs inside Databricks?

Thanks a lot for the support!
Michal

3 REPLIES

mark_ott
Databricks Employee

Yes, it is possible to connect an external AI agent to an MCP (Model-Serving Custom Platform) server hosted within Databricks, and there are some benefits and options for working with Databricks data that depend on your architecture choices.

Connecting External AI Agents to Databricks MCP

  • You can host your custom MCP server within Databricks, and expose its endpoints via the authentication and networking setup that Databricks provides (such as allowing external access via secure endpoints or APIs).

  • As long as your MCP server is accessible over the network (using public or secure/private endpoints that you configure), an AI agent running outside of Databricks can connect to it by making REST API calls or using SDKs compatible with the MCP interface; a minimal sketch follows this list.
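
To make this concrete, here is a minimal Python sketch of an external agent calling a tool endpoint on a Databricks-hosted MCP server over HTTPS. The URL, route, and payload shape are hypothetical placeholders; the real interface depends on how your MCP server exposes its tools.

```python
import os

import requests

# Hypothetical URL of an MCP server hosted as a Databricks app; substitute
# the real hostname and the route your server actually exposes.
MCP_URL = "https://my-mcp-app.example.databricksapps.com/api/tools/query_table"

# Authenticate with a bearer token (PAT or OAuth access token) from the environment.
headers = {
    "Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}",
    "Content-Type": "application/json",
}

# Hypothetical tool-call payload: ask the server to read a Unity Catalog table.
payload = {"table": "main.sales.orders", "limit": 10}

resp = requests.post(MCP_URL, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```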

MCP Methods for Unity Catalog Data

  • Direct Databricks Access: Since your MCP server is hosted inside Databricks, it typically has faster and more privileged access to Databricks-managed resources, like Unity Catalog. You can leverage Databricks-native APIs or even direct Spark jobs within the MCP server's code to access and process Unity Catalog data efficiently.

  • REST vs. Native Access:

    • If you host MCP inside Databricks, your MCP methods can directly interact with Unity Catalog using Databricks SDKs (Python, Scala, SQL) or Spark APIs without the overhead of going through REST APIs.

    • If you plan to keep MCP outside Databricks, you would have to use Databricks REST APIs to interact with Unity Catalog, which adds additional network calls and potential complexity.

  • Recommended: Since MCP is inside Databricks, implement direct access methods using Databricks APIs rather than wrapping REST calls; this approach is more streamlined, efficient, and secure. A sketch follows below.
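
As one illustration, an MCP server running inside Databricks can query Unity Catalog through the Databricks Python SDK rather than hand-rolled REST calls. This is a minimal sketch assuming the databricks-sdk package and an existing SQL warehouse; the warehouse ID and table name are placeholders.

```python
from databricks.sdk import WorkspaceClient

# Inside Databricks (e.g., a Databricks app), WorkspaceClient() picks up
# credentials from the runtime environment, so no tokens are handled in code.
w = WorkspaceClient()

# Placeholder warehouse ID and Unity Catalog table; substitute your own.
result = w.statement_execution.execute_statement(
    warehouse_id="1234567890abcdef",
    statement="SELECT * FROM main.sales.orders LIMIT 10",
)

# For short queries the rows come back inline; larger results require
# polling or fetching result chunks.
for row in result.result.data_array:
    print(row)
```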

Summary Table

| Component | Location | Best Access Method | Notes |
|---|---|---|---|
| AI Agent | Outside Databricks | MCP REST API | Connects to MCP over the network |
| MCP Server | Inside Databricks | Direct Databricks/Spark APIs | Native, fast, secure |
| MCP Server | Outside Databricks | Databricks REST API | Less direct, more overhead |
| Unity Catalog (data layer) | Managed by Databricks | | |

Key Recommendations

  • Connect your external AI agent to the MCP server via REST without issues as long as networking and permissions are properly configured.

  • Since MCP is running in Databricks, use internal Databricks APIs for Unity Catalog access instead of building REST-based data access in your MCP server logic.

This approach will offer you more efficient, secure, and robust access to your data within Databricks while supporting external AI agent connectivity.

maikel
New Contributor II

Hello Mark!

Thanks a lot for the response! By MCP I meant a Model Context Protocol server, not "Model-Serving Custom Platform".

I am thinking about these two approaches:

  • AI agent outside Databricks -> MCP server outside Databricks -> data in Unity Catalog

The reason for this approach is that I would like to have more control over the MCP deployment, resources, etc. In this scenario, as you mentioned, we should go with the REST API between the MCP server and the Unity Catalog data. Could you please advise what the best option is for authentication through code (without browser involvement)? Can I create a secure endpoint for this? If so, how can I do it?

  • AI agent outside Databricks -> MCP server inside Databricks -> data in Unity Catalog

Can you please share more details on how I can create a secure endpoint in Databricks so that my agent can authenticate to the MCP server?
If the MCP server is inside Databricks, does that mean that in my Python code I can just use, e.g., PySpark functions to access the data? Is no authorization required?

Thank you!

Best regards,

Michal

mark_ott
Databricks Employee

Hopefully this helps...

You can securely connect your external AI agent to a Model Context Protocol (MCP) server and Unity Catalog while maintaining strong control over authentication and resource management. The method depends on whether MCP is outside or inside Databricks. Below are best practices and details for secure endpoint creation and authentication.

MCP Outside Databricks

When the MCP server is outside Databricks and needs to access Unity Catalog data, REST API calls are used. Secure, code-based authentication (no browser) is achievable with the methods below; a combined sketch follows this list:

  • Personal Access Token (PAT) Authentication:
    You can generate a Databricks PAT and include it in the Authorization header of your REST API requests. This is suitable for automation and code-only flows, without browser involvement.

    • Generate PAT in Databricks: Go to User Settings > Access Tokens.

    • Use the header:

      Authorization: Bearer <token>
    • Store the PAT securely (environment variable, secret manager).

  • Service Principal (Workspace-Managed Identity):
    For production and enterprise setups, use a service principal registered with Databricks.

    • Authenticate via client credentials (client ID/secret or certificate) using OAuth2 flows from your MCP or agent’s code.

    • Obtain workspace and catalog access via API, using the principal’s scopes and roles.

  • Securing the Endpoint:

    • Host your MCP server on a secure cloud VM or service (ensure HTTPS/TLS).

    • Require authentication for access and provide only secure REST API endpoints.

    • Store credentials (tokens/secrets) outside the codebase, preferably with a secret management service.
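
As an illustration of both code-only options, the sketch below sends a PAT as a bearer token and also fetches an OAuth access token with a service principal's client credentials. The workspace URL is a placeholder; the /oidc/v1/token endpoint and all-apis scope follow Databricks' documented OAuth machine-to-machine flow, but verify them against the current docs for your cloud.

```python
import os

import requests

WORKSPACE = "https://dbc-a1b2c3d4-e5f6.cloud.databricks.com"  # placeholder

# Option 1: Personal Access Token (PAT) as a bearer token.
pat_headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

# Option 2: OAuth2 client credentials for a service principal (M2M flow).
token_resp = requests.post(
    f"{WORKSPACE}/oidc/v1/token",
    auth=(os.environ["DATABRICKS_CLIENT_ID"], os.environ["DATABRICKS_CLIENT_SECRET"]),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
    timeout=30,
)
token_resp.raise_for_status()
oauth_headers = {"Authorization": f"Bearer {token_resp.json()['access_token']}"}

# Either header works against the Unity Catalog REST API, e.g. listing catalogs.
resp = requests.get(
    f"{WORKSPACE}/api/2.1/unity-catalog/catalogs", headers=oauth_headers, timeout=30
)
resp.raise_for_status()
print([c["name"] for c in resp.json().get("catalogs", [])])
```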

Secure Endpoints in Databricks

When deploying MCP inside Databricks, you can use Databricks native security and authentication mechanisms:

  • Databricks REST Endpoints:

    • You can create standard REST endpoints in Databricks, protected by workspace authentication.

    • Set up the endpoint as a Databricks app (the documented host for a custom MCP server) or as an MLflow model serving endpoint.

    • Secure using Bearer tokens (PATs) or service principals, as above.

  • Unity Catalog Access Control:

    • Assign catalog, schema, and table permissions to users, groups, or service principals in Unity Catalog.

    • Only entities with the appropriate permissions (via access control lists/policies) can query data; a sample grant is sketched below.
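
For example, granting a service principal read access to one table can be done in SQL from a notebook or job, here via PySpark's spark.sql (spark is predefined by the Databricks runtime; the catalog, schema, table, and application ID are placeholders):

```python
# Runs in a Databricks notebook or job, where `spark` is predefined.
# Grant a service principal (by application ID) read access to a single table.
sp = "`11111111-2222-3333-4444-555555555555`"  # placeholder application ID
spark.sql(f"GRANT USE CATALOG ON CATALOG main TO {sp}")
spark.sql(f"GRANT USE SCHEMA ON SCHEMA main.sales TO {sp}")
spark.sql(f"GRANT SELECT ON TABLE main.sales.orders TO {sp}")
```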

MCP Inside Databricks: Accessing via Python

  • Native Access to Unity Catalog:

    • If MCP is running inside a Databricks workspace (e.g., as a notebook, job, or managed MLflow endpoint), it can access Unity Catalog directly using PySpark, Databricks SQL, or REST API.

    • Authorization is seamless if the code runs under a user/service principal with catalog permissions.

    • Best practice: assign least-privileged roles, and audit usage.

  • No Explicit Authorization Required:

    • When running inside Databricks under the correct user or service principal, no extra authentication steps are needed; Databricks manages the session tokens.

    • Access is governed by Databricks' authentication context, so spark.read.table("catalog.schema.table") will work if permissions are set (see the sketch below).
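
A minimal sketch of that pattern, as it would run inside a Databricks notebook or job (the table name is a placeholder):

```python
# Inside Databricks, `spark` is predefined and carries the caller's identity,
# so no tokens appear in code; Unity Catalog permissions decide what is readable.
df = spark.read.table("main.sales.orders")  # placeholder table
df.show(10)
```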

Summary Table: Authentication Approaches

| Scenario | Endpoint Security | Authentication in Code | Browser Needed? | Notes |
|---|---|---|---|---|
| AI agent -> MCP outside Databricks -> Unity Catalog | HTTPS REST API | PAT / service principal | No | Secure with token header or OAuth2 |
| MCP endpoint inside Databricks | Databricks REST/MLflow | PAT / service principal | No | Native workspace authentication |
| MCP via PySpark inside Databricks | Databricks runtime | Workspace session | No | Managed by workspace/session context |

Key Recommendations

  • Always use HTTPS for all endpoints.

  • Prefer service principals and managed identities for scalable, secure automation.

  • Store tokens and secrets securely, using environment variables or cloud secret managers.

  • Set fine-grained Unity Catalog permissions to limit data access to only what’s needed.

  • For MCP inside Databricks, leverage PySpark/DataFrame APIs for direct access, with minimal authentication setup required.

If you need code samples or steps for endpoint setup or OAuth2 authorization, please specify which platform (Azure, AWS, GCP) and Databricks environment you’re using.


For technical implementation details, including authorization flows and secure endpoint creation inside/outside Databricks, review the official Databricks documentation and your cloud provider's security guidelines.