a week ago
Hello Community!
I have the following use case in my project:
User -> AI agent -> MCP Server -> Databricks data from unity catalog.
- The AI agent is not created in Databricks.
- The MCP server is created in Databricks and should expose tools to get data from a given Unity Catalog.
I can see in https://docs.databricks.com/aws/en/generative-ai/mcp/custom-mcp that it should be possible to host an MCP server in a Databricks app. My question is: would it be possible to connect it to an agent outside of Databricks? Additionally, should I implement methods in my custom MCP server to work with Databricks data via REST, or can this be simplified because the MCP server runs inside Databricks?
Thanks a lot for the support!
Michal
a week ago
Yes, it is possible to connect an external AI agent to an MCP (Model-Serving Custom Platform) server hosted within Databricks, and there are some benefits and options for working with Databricks data that depend on your architecture choices.
You can host your custom MCP server within Databricks, and expose its endpoints via the authentication and networking setup that Databricks provides (such as allowing external access via secure endpoints or APIs).
As long as your MCP server is accessible over the network (using public or secure/private endpoints that you configure), an AI agent running outside of Databricks can connect to it by making REST API calls or using SDKs compatible with the MCP interface.
Direct Databricks Access: Since your MCP server is hosted inside Databricks, it typically has faster and more privileged access to Databricks-managed resources, like Unity Catalog. You can leverage Databricks-native APIs or even direct Spark jobs within the MCP server's code to access and process Unity Catalog data efficiently.
REST vs. Native Access:
If you host MCP inside Databricks, your MCP methods can directly interact with Unity Catalog using Databricks SDKs (Python, Scala, SQL) or Spark APIs without the overhead of going through REST APIs.
If you plan to keep MCP outside Databricks, you would have to use Databricks REST APIs to interact with Unity Catalog, which adds additional network calls and potential complexity.
Recommended: Since MCP is inside Databricks, implement direct access methods using Databricks APIs rather than wrapping REST calls; this approach is more streamlined, efficient, and secure.
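As a rough illustration of the native-access option, here is a minimal sketch. It assumes the server code runs inside a Databricks workspace with an active Spark session, and the three-level table name is a placeholder:

```python
# Minimal sketch: a tool running inside Databricks reads a Unity Catalog table
# directly through Spark instead of wrapping Databricks REST calls.
# "main.sales.transactions" is a placeholder UC name; adjust to your catalog.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already provided in Databricks notebooks/jobs

def get_table_preview(limit: int = 10) -> list[dict]:
    """Return a few rows from a Unity Catalog table for the calling agent."""
    df = spark.read.table("main.sales.transactions").limit(limit)
    return [row.asDict() for row in df.collect()]
```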
| Component | Location | Best Access Method | Notes |
|---|---|---|---|
| AI Agent | Outside Databricks | MCP REST API | Connects to MCP over network |
| MCP Server | Inside Databricks | Direct Databricks/Spark APIs | Native, fast, secure |
| MCP Server | Outside Databricks | Databricks REST API | Less direct, more overhead |
| Unity Catalog (Data Layer) | Managed by Databricks | - | - |
Connect your external AI agent to the MCP server via REST without issues as long as networking and permissions are properly configured.
Since MCP is running in Databricks, use internal Databricks APIs for Unity Catalog access instead of building REST-based data access in your MCP server logic.
This approach will offer you more efficient, secure, and robust access to your data within Databricks while supporting external AI agent connectivity.
a week ago
Hello Mark!
Thanks a lot for the response! By MCP I meant a Model Context Protocol server, not a Model-Serving Custom Platform.
I am thinking about those two approaches:
The reason for such an approach is that I would like to have more control over the MCP deployment, resources, etc. In this scenario, as you mentioned, we should go with the REST API between the MCP server and Unity Catalog data. Could you please advise what the best option is for authentication through code (without browser involvement)? Can I create a secure endpoint for this? If so, how can I do this?
Can you please share more details on how I can create a secure endpoint in Databricks so that my agent can authenticate to the MCP server?
If I have the MCP server inside Databricks, does that mean that in my Python code I can just use e.g. PySpark functions to access the data? No authorization required?
Thank you!
Best regards,
Michal
12 hours ago
Hopefully this helps...
You can securely connect your external AI agent to a Model Context Protocol (MCP) server and Unity Catalog while maintaining strong control over authentication and resource management. The method depends on whether MCP is outside or inside Databricks. Below are best practices and details for secure endpoint creation and authentication.
When the MCP server is outside Databricks and needs to access Unity Catalog data, it will use REST API calls. Secure, code-based authentication (no browser involvement) is achievable with the following methods:
Personal Access Token (PAT) Authentication:
You can generate a Databricks PAT and send it in the Authorization header of your REST API requests as a bearer token. This is suitable for automation and code-only flows, without browser involvement.
Generate PAT in Databricks: Go to User Settings > Access Tokens.
Use header:
Authorization: Bearer <token>
Store the PAT securely (environment variable, secret manager).
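For illustration, a minimal sketch of a code-only call using a PAT, assuming the host and token come from environment variables and using the Unity Catalog tables endpoint as the example request:

```python
# Sketch: authenticate to the Databricks REST API with a PAT (no browser flow).
# Host and token are read from environment variables; never hard-code the PAT.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace>.cloud.databricks.com
token = os.environ["DATABRICKS_TOKEN"]  # PAT from User Settings > Access Tokens

resp = requests.get(
    f"{host}/api/2.1/unity-catalog/tables",
    headers={"Authorization": f"Bearer {token}"},
    params={"catalog_name": "main", "schema_name": "sales"},  # placeholder names
    timeout=30,
)
resp.raise_for_status()
for table in resp.json().get("tables", []):
    print(table["full_name"])
```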
Service Principal (Workspace-Managed Identity):
For production and enterprise setups, use a service principal registered with Databricks.
Authenticate via client credentials (client ID/secret or certificate) using OAuth2 flows from your MCP or agent's code.
Obtain workspace and catalog access via API, using the principalโs scopes and roles.
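A hedged sketch of the client-credentials (machine-to-machine) flow against the workspace token endpoint; the client ID and secret are assumed to belong to a service principal you have provisioned, and are read from environment variables:

```python
# Sketch: OAuth2 client-credentials flow for a Databricks service principal (no browser).
import os
import requests

host = os.environ["DATABRICKS_HOST"]
client_id = os.environ["DATABRICKS_CLIENT_ID"]
client_secret = os.environ["DATABRICKS_CLIENT_SECRET"]

token_resp = requests.post(
    f"{host}/oidc/v1/token",
    auth=(client_id, client_secret),
    data={"grant_type": "client_credentials", "scope": "all-apis"},
    timeout=30,
)
token_resp.raise_for_status()
access_token = token_resp.json()["access_token"]

# The short-lived access token is then sent exactly like a PAT:
headers = {"Authorization": f"Bearer {access_token}"}
```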
Securing the Endpoint:
Host your MCP server on a secure cloud VM or service (ensure HTTPS/TLS).
Require authentication for access and provide only secure REST API endpoints.
Store credentials (tokens/secrets) outside the codebase, preferably with a secret management service.
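As one possible shape for this (the framework choice and token name are assumptions, not something from this thread), a tiny FastAPI-style handler that rejects requests without the expected bearer token; TLS itself would be terminated by the hosting layer:

```python
# Sketch: require a bearer token on every request to a self-hosted MCP server.
# MCP_SERVER_TOKEN is a hypothetical secret injected via env/secret manager.
import os
from fastapi import FastAPI, Header, HTTPException

app = FastAPI()
EXPECTED_TOKEN = os.environ["MCP_SERVER_TOKEN"]

@app.get("/health")
def health(authorization: str = Header(default="")) -> dict:
    if authorization != f"Bearer {EXPECTED_TOKEN}":
        raise HTTPException(status_code=401, detail="invalid or missing token")
    return {"status": "ok"}
```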
When deploying MCP inside Databricks, you can use Databricks native security and authentication mechanisms:
Databricks REST Endpoints:
You can host your custom MCP server as a Databricks App (per the custom MCP docs you linked) and expose it as an HTTPS endpoint protected by workspace authentication.
Set up an endpoint using Databricks Jobs, Delta Live Tables, or MLflow model serving.
Secure using Bearer tokens (PATs) or service principals, as above.
Unity Catalog Access Control:
Assign workspace, schema, and table permissions to users, groups, or service principals in Unity Catalog.
Only entities with appropriate permissions (via access control lists/policies) can query data.
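For example, a minimal sketch of the grants a catalog owner/admin might issue so that a service principal can read one table (identifiers are placeholders; service principals are usually referenced by their application ID, and `spark` is the session predefined in Databricks notebooks):

```python
# Sketch: grant a service principal read access to one Unity Catalog table.
# Run by a catalog owner/admin in a Databricks notebook; names are placeholders.
spark.sql("GRANT USE CATALOG ON CATALOG main TO `my-service-principal`")
spark.sql("GRANT USE SCHEMA ON SCHEMA main.sales TO `my-service-principal`")
spark.sql("GRANT SELECT ON TABLE main.sales.transactions TO `my-service-principal`")
```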
Native Access to Unity Catalog:
If MCP is running inside a Databricks workspace (e.g., as a notebook, job, or managed MLflow endpoint), it can access Unity Catalog directly using PySpark, Databricks SQL, or REST API.
Authorization is seamless if the code runs under a user/service principal with catalog permissions.
Best practice: assign least-privileged roles, and audit usage.
No Explicit Authorization Required:
When running inside Databricks under the correct role/identity, explicit extra authentication steps are not needed; Databricks manages session tokens.
Access is managed by Databricks' authentication context, so spark.read.table("catalog.schema.table") will work if permissions are set.
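To illustrate the "no extra auth steps" point: inside the workspace, the Databricks SDK resolves the ambient identity automatically, so no token handling appears in the code. A sketch, assuming the databricks-sdk package and placeholder catalog/schema names:

```python
# Sketch: inside Databricks, the SDK picks up credentials from the runtime context,
# so no PAT or client secret is handled explicitly in the code.
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # resolves the workspace/session identity automatically
for t in w.tables.list(catalog_name="main", schema_name="sales"):  # placeholder names
    print(t.full_name)
```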
| Scenario | Endpoint Security | Authentication in Code | Browser needed? | Notes |
|---|---|---|---|---|
| AI agent -> MCP outside Databricks -> Unity Catalog | HTTPS REST API | PAT / Service Principal | No | Secure with token header or OAuth2 |
| MCP Endpoint inside Databricks | Databricks REST/MLflow | PAT / Service Principal | No | Native workspace authentication |
| MCP via PySpark inside Databricks | Databricks runtime | Workspace session | No | Managed by workspace/session context |
Always use HTTPS for all endpoints.
Prefer service principals and managed identities for scalable, secure automation.
Store tokens and secrets securely, using environment variables or cloud secret managers.
Set fine-grained Unity Catalog permissions to limit data access to only what's needed.
For MCP inside Databricks, leverage PySpark/DataFrame APIs for direct access, with minimal authentication setup required.
If you need code samples or steps for endpoint setup or OAuth2 authorization, please specify which platform (Azure, AWS, GCP) and Databricks environment you're using.
For technical implementation details, including authorization flows and secure endpoint creation inside/outside Databricks, review the official Databricks documentation and your cloud provider's security guidelines.