Technical Blog

alexandergenser
Databricks Employee

Building agents with the Model Context Protocol on Databricks

In the past year, AI applications and their capabilities have advanced significantly, from LLM prompting to retrieval-augmented generation (RAG), to the point where we now develop agentic AI systems that can access tools to query external data sources, call an API, or search the web. By now, plenty of frameworks are available that let you a) build these agentic systems and b) implement, orchestrate, and make tools available to your agent.

As agent development scales within an enterprise organization, we certainly want a standard for how agents and tools communicate and how new tools are integrated into a large ecosystem. This is where the Model Context Protocol (MCP), developed by Anthropic in 2024, comes into play. MCP is an open-source standard for connecting agents to external systems (or, to quote the MCP documentation, “Think of MCP like a USB-C port for AI applications”). In 2026, MCP has become the industry standard. To learn more about the basics of MCP, visit the official documentation website.

In this blog, we are going to explore how to leverage MCP on Databricks. For this, we will extend the travel agent use case from this blog into a fully fledged travel assistant backed by MCP on Databricks. In short, this agent accesses live train network data to provide the fastest routes to any destination. First, we outline the MCP options on Databricks and design the architecture of the AI travel assistant. Then we build and test our agent system, both in the Databricks Playground for rapid testing and with code that allows us to evaluate and deploy the solution to production.

 

MCP on Databricks - What are managed, external, and custom MCP servers?

MCP is now fully integrated into Databricks Mosaic AI (a suite of tooling designed to help developers build and deploy high-quality generative AI applications), alongside Unity Catalog (UC) function tools and agent code tools. You can decide per use case, based on governance, flexibility, and integrability requirements, which option to use (see the documentation for guidance).

Databricks MCP can be deployed in three ways: managed MCP servers, external MCP servers, and custom MCP servers. We briefly walk through these options below, based on our travel agent example. Let’s explore the building blocks of our agent in Figure 1.

arch.png

Figure 1: High-level architecture of the travel assistant Agent with MCP.

Assume we want to develop an agent that is capable of giving a helpful and accurate response to the following query: 

“I want to go Nordic skiing in Zermatt, Switzerland. I want to go tomorrow, early morning, from Zurich HB. Make a plan!” 

To give a tailored and helpful answer, we need more than just a generic LLM response. Hence, we are leveraging three types of MCP servers that we create in this blog:

  • Managed MCP server: A UC function that provides train connections from the Swiss Transport API.
  • External MCP server: An external web search service using You.com to retrieve relevant leisure/sport information from the web in real time.
  • Custom MCP server: A FastMCP implementation hosted via a Databricks App that retrieves data from the AccuWeather API.

Before we dive deeper and start developing the MCP servers in Databricks, let’s quickly highlight what each type of MCP server represents and when to use which:

Managed MCP servers

Databricks managed MCP servers provide ready-to-use endpoints. These allow both internal AI agents developed on Databricks and external clients to leverage functionality like UC functions, Vector Search, Genie spaces, and the Databricks SQL (DBSQL) engine. All REST API–based servers fully enforce data permissions and governance via UC, and no infrastructure management is required. For example, a managed MCP server providing UC function tools is exposed via the following URL pattern:

https://<workspace-hostname>/api/2.0/mcp/functions/{catalog}/{schema}

The URL can then be leveraged by an agent deployed with Databricks (as we are going to do below), but also by external systems that want to leverage managed MCPs (e.g., a Genie space). See the documentation of managed MCPs for more information and example notebooks.
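The URL pattern above can be assembled programmatically when wiring up clients; a minimal sketch (the helper function is illustrative and not part of any Databricks SDK):

```python
def managed_mcp_url(workspace_hostname: str, catalog: str, schema: str) -> str:
    """Build the managed MCP endpoint URL for the UC functions in a given schema."""
    return f"https://{workspace_hostname}/api/2.0/mcp/functions/{catalog}/{schema}"

# The schema-level endpoint used later in this blog:
print(managed_mcp_url("adb-xxxx.yy.azuredatabricks.net", "travel_agents", "train_agent"))
# https://adb-xxxx.yy.azuredatabricks.net/api/2.0/mcp/functions/travel_agents/train_agent
```

Because the endpoint is schema-scoped, pointing a client at this URL exposes every function in that schema as a tool.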

External MCP servers

Third-party MCP servers hosted outside of Databricks can be integrated via external MCP servers. Managed proxy endpoints and UC HTTP connections allow secure access to external tools and APIs without exposing credentials. To set up an external server, you can either discover and install an MCP server from the Databricks Marketplace or create a custom UC connection. Exposure is handled via the following URL pattern:

https://<workspace-hostname>/api/2.0/mcp/external/{connection_name}

Check out the documentation of external MCP servers to find a setup guide and examples for using them programmatically.

Custom MCP servers

Databricks custom MCP servers enable you to host your own or third-party MCP server implementations as Databricks Apps, providing a simple and managed way to expose custom tools to agents. Deployment as a Databricks App is required, and servers must implement an HTTP-compatible transport (such as Streamable HTTP). After the app is deployed, the endpoint is accessible via the pattern:

https://<app‑url>/mcp

For setting up the environment, implementing the MCP server, and deploying it as a Databricks App, refer to the documentation. Note that programmatic access using DatabricksMCPClient with a WorkspaceClient (to list and invoke tools) requires an authentication mechanism that depends on which client is used (e.g., in notebooks, the default authentication for the WorkspaceClient is disabled, and you need a service principal; see here for details). We will showcase this setup in the sections below.

 

Developing the MCP servers

We will now follow Figure 1 above and build three different MCP servers that provide the tools for the agent to answer travel assistant queries from users.

Train connection retrieval with Databricks-managed MCP server

We start with the managed MCP server, which hosts and exposes a UC function that performs an API call to the connections endpoint of the Transport API. More information on the API and UC functions can be found in this blog. Here is the code for our UC function:

%sql
CREATE OR REPLACE FUNCTION travel_agents.train_agent.get_connections(
   from_station STRING COMMENT 'The train station of departure',
   to_station STRING COMMENT 'The train station of arrival',
   via_station STRING COMMENT 'The desired stops in between departure and arrival',
   date STRING COMMENT 'Date of the connection, in the format YYYY-MM-DD',
   time STRING COMMENT 'Time of the connection, in the format hh:mm'
)
RETURNS STRING
COMMENT 'Executes a call to the transport api and connections endpoint to retrieve relevant train connections given the input parameters from (departure), to (arrival), via (stops in between, if specified), date, and time.'
LANGUAGE PYTHON
AS $$
import requests

url = "http://transport.opendata.ch/v1/connections"
params = {
   "from": from_station,
   "to": to_station,
   "via": via_station,
   "date": date,
   "time": time,
   "transportations": "train"
}

response = requests.get(url, params=params)
if response.status_code == 200:
   # Serialize to text: the function is declared to RETURN STRING
   return str(response.json()['connections'])
else:
   return f"Failed to retrieve connection. Status code: {response.status_code}"
$$;

Note that a catalog travel_agents and, within this catalog, a schema train_agent need to be created prior to running this code. When we create this UC function, a managed MCP server is automatically exposed via the following endpoint:

https://<workspace-hostname>/api/2.0/mcp/functions/travel_agents/train_agent

The hostname can be derived from the URL of your Databricks workspace. For example, if you use Azure Databricks, this will resemble https://adb-xxxx.yy.azuredatabricks.net/. Note that the endpoint above exposes the schema, meaning that all functions created within that schema are exposed as individual tools of that managed MCP server. With this function in place, we can a) use it in the Playground for rapid prototyping and b) take the endpoint URL and use it when implementing our agent in Python. Both options are outlined in the next sections.
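The connections payload returned by the Transport API is fairly verbose; for illustration, here is one way it could be condensed before an agent consumes it. This sketch assumes a simplified response shape and is not part of the UC function above:

```python
def summarize_connections(payload: dict, limit: int = 3) -> list[str]:
    """Condense a Transport API connections payload into short, LLM-friendly lines.

    Assumes a simplified version of the /v1/connections response shape.
    """
    lines = []
    for conn in payload.get("connections", [])[:limit]:
        dep, arr = conn["from"], conn["to"]
        lines.append(
            f"{dep['station']['name']} {dep['departure']} -> "
            f"{arr['station']['name']} {arr['arrival']} "
            f"({conn.get('transfers', 0)} transfers, duration {conn.get('duration', '?')})"
        )
    return lines

# Hypothetical sample payload mirroring the fields used above
sample = {
    "connections": [{
        "from": {"station": {"name": "Zürich HB"}, "departure": "2026-01-14T06:02:00"},
        "to": {"station": {"name": "Zermatt"}, "arrival": "2026-01-14T09:13:00"},
        "transfers": 1,
        "duration": "00d03:11:00",
    }]
}
print(summarize_connections(sample))
```

Shorter tool outputs like this generally reduce token usage and make the LLM's job easier.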

Activity retrieval with an external web search MCP server

To find the most relevant activities at your travel destination, we recommend performing a web search. This can help to find the best-rated skiing resorts, discover beginner hikes in the area, or check resorts' opening hours, pricing, and conditions. We will use You.com’s Web Search API, which also exposes an MCP server. Follow this guide to create an API key (Bearer token).


Figure 2 depicts the steps for creating a UC connection. If you navigate to Catalog -> External Data -> Connections, you can see a connection that points to the You.com API, specifically to its MCP server (/mcp). By following the steps in Figure 2, you create such an HTTP connection, entering the URL and Bearer token obtained from You.com and ensuring that the base path is /mcp.

fiiinnnaaal.png

Figure 2: Creating a UC connection to the external MCP server.

This setup ensures that the connection to external MCP servers is both secure and easily sharable within Databricks workspaces, leveraging managed proxies and automated OAuth flows. Additionally, UC enables discoverability and governance of these connections, simplifying compliance and operational management for external integrations.

Finally, we have a UC connection to an external MCP Server exposed via the following endpoint to be leveraged via the Databricks Playground or in custom implementations:

https://<workspace-hostname>/api/2.0/mcp/external/mcp-websearch-assistant

Weather forecast retrieval with a custom MCP server

Finally, we aim to enhance our agentic system with the capability to retrieve a weather forecast for a specific location and within a reasonable time frame that aligns with the travel plan. For the implementation, we are going to do the following steps:

  • Deploy a Databricks app to host our custom MCP server
  • Utilize an MCP app template with FastMCP to build our custom MCP application
  • Implement two tools that use the AccuWeather APIs to retrieve weather forecasts

To deploy a Databricks App, you can follow the steps outlined in the Databricks Apps Cookbook. We are going to leverage one of the new Agent templates for Databricks Apps, “MCP Server - Simple”, which uses FastMCP under the hood. The template comes with dependencies pre-installed and with operational tools, such as an MCP server health check, out of the box.

Important to note here: Ensure the prefix ‘mcp-’ is used when specifying the app name, as this is crucial for the Playground to recognize the custom MCP server.

deploy_app.png

Figure 3: Creating a custom MCP server by using the agent templates of Databricks Apps.

After successfully creating the custom MCP server, we sync the project to our local IDE using the Databricks CLI. You can use your favorite IDE (VSCode, PyCharm, Cursor, or others). Ensure you have a working setup of the Databricks CLI and an authentication profile configured for your workspace (documentation). Let’s sync the project with the following CLI command:

databricks workspace export-dir /Workspace/Users/<username>/mcp-weather-assistant -p <profile_name> .

Now we can start implementing two tools that will, under the hood, make two separate API calls. First, we need to retrieve an AccuWeather-specific location_key. This is necessary because the forecast API endpoint does not accept a location string but requires this key. We then use the location_key to get a forecast with two parameters: forecast_type, which represents the granularity (e.g., ‘hourly’ or ‘daily’), and forecast_horizon, representing the horizon in the format ‘24hour’, ‘1day’, etc.
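The two-step flow can be sketched as a small orchestration with the API calls stubbed out; the function name and sample data here are illustrative, and the real tools are implemented below:

```python
from typing import Callable

def plan_forecast(
    lookup: Callable[[str], dict],
    forecast: Callable[[str, str, str], list],
    city: str,
    forecast_type: str = "hourly",
    forecast_horizon: str = "24hour",
) -> list:
    """Chain the two AccuWeather steps: resolve the location key, then fetch the forecast.

    `lookup` and `forecast` stand in for the two MCP tools; injecting them lets us
    exercise the flow without live API calls.
    """
    key = lookup(city)["key"]
    return forecast(key, forecast_type, forecast_horizon)

# Stubbed tools for a dry run (hypothetical data)
fake_lookup = lambda city: {"key": "315988", "location": {"LocalizedName": city}}
fake_forecast = lambda key, ftype, horizon: [
    {"Temperature": {"Value": -4.0, "Unit": "C"}, "IconPhrase": "Snow"}
]

print(plan_forecast(fake_lookup, fake_forecast, "Zermatt"))
```

In the deployed MCP server, the agent performs this chaining itself by calling the two tools in sequence.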

We will implement two functions that represent MCP tools. FastMCP provides a set of decorators for implementing MCP functionality. In the app.py template, an object of the FastMCP class is already instantiated, which we will use in the decorator to declare a callable tool, i.e., @mcp_server.tool. 

Investigating the AccuWeather Core Weather API documentation, we implement the following function to retrieve the location_key:

@mcp_server.tool
def get_location_key_by_city(city_name: str):
    """
    Get the AccuWeather location key for a given city name. This tool searches
    for a city by name and returns its unique AccuWeather location key, which
    is required for fetching weather forecasts.

    Args:
        city_name: The name of the city to search for (e.g., "Zurich", "New York")
    Returns:
        dict: Contains 'key' (the location key string) and 'location' (full
              location details)
    Raises:
        ValueError: If no locations are found for the given city name
        requests.HTTPError: If the API request fails
    """
    url = f"{BASE_URL}/locations/v1/cities/search"
    params = {"apikey": api_key, "q": city_name, "language": "en-us"}

    r = requests.get(url, params=params)
    r.raise_for_status()
    locations = r.json()
    if not locations:
        raise ValueError(f"No locations found for {city_name}")

    return {"key": locations[0]["Key"], "location": locations[0]}

By returning the first element of the retrieved locations list, we provide the best match and extract the corresponding location_key. Next, we use this key to retrieve a forecast for that location. As the API is flexible in terms of forecast horizons (check which ones require a premium plan!), we implement a tool that takes forecast_type and forecast_horizon as parameters:

@mcp_server.tool
def get_24_hours_weather_forecast_by_location_key(location_key: str, 
    forecast_type: str = "hourly", forecast_horizon: str = "24hour"):
    """
    Get weather forecast for a given AccuWeather location key.
    Fetches hourly weather predictions for the forecast type and horizon, including
    temperature, precipitation probability, and weather conditions.

    Args:
        location_key: The AccuWeather location key (obtained from 
                      get_location_key_by_city)
        forecast_type: The type of forecast to fetch (e.g., "hourly", "daily")
        forecast_horizon: The horizon of the forecast to fetch (e.g., "24hour", "72hour", 
                       "1day", "7day")
    Returns:
        list: Array of hourly forecast objects containing:
             - DateTime: ISO timestamp for the forecast hour
             - Temperature: Temperature value and unit (metric/Celsius)
             - IconPhrase: Text description of weather conditions
             - PrecipitationProbability: Chance of precipitation (0-100%)
    """
    url = f"{BASE_URL}/forecasts/v1/{forecast_type}/{forecast_horizon}/{location_key}"
    params = {"apikey": api_key, "language": "en-us", "metric": "true"}
    r = requests.get(url, params=params)
    r.raise_for_status()
    return r.json()

Note that we declared two variables outside of the tool implementations: BASE_URL, the base URL of the API endpoints, and api_key, the AccuWeather API key that we fetch from a secret scope via dbutils (see documentation):

BASE_URL = "https://dataservice.accuweather.com"
api_key = dbutils.secrets.get(scope="<scope-name>", key="<key-name>")
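Before wiring the tools into the app, it can help to sanity-check the forecast post-processing locally. The sketch below condenses hourly entries using the fields named in the tool docstring above (DateTime, Temperature, IconPhrase, PrecipitationProbability); the helper and sample data are illustrative:

```python
def summarize_hourly(forecast: list, max_items: int = 3) -> list[str]:
    """Condense AccuWeather-style hourly forecast entries into short lines."""
    return [
        f"{hour['DateTime']}: {hour['Temperature']['Value']}°{hour['Temperature']['Unit']}, "
        f"{hour['IconPhrase']}, precip {hour['PrecipitationProbability']}%"
        for hour in forecast[:max_items]
    ]

# Hypothetical hourly entry mirroring the documented fields
sample = [{
    "DateTime": "2026-01-14T07:00:00+01:00",
    "Temperature": {"Value": -6.5, "Unit": "C"},
    "IconPhrase": "Mostly cloudy",
    "PrecipitationProbability": 20,
}]
print(summarize_hourly(sample))
```

Returning condensed lines like this instead of the raw JSON array is an optional refinement that keeps tool outputs small.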

Finally, we add these code snippets to tool.py in our Databricks App project. Once done, we can sync the code back to our Databricks workspace by running

databricks sync --watch . /Workspace/Users/<username>/mcp-weather-assistant -p <profile_name>

and deploy the app with the following CLI command (you can also achieve this in the UI by clicking the deploy button):

databricks apps deploy mcp-weather-assistant --source-code-path /Workspace/Users/<username>/mcp-weather-assistant

We have successfully deployed our custom weather forecast MCP server on Databricks!

 

Bringing it all together in the Databricks Playground

For rapid prototyping and testing of our agent, we will use the Databricks Playground in the Databricks UI. MCP servers are now fully integrated with the Playground, so we can simply select the servers we created above. As highlighted below, navigate to Playground -> Add tool -> MCP servers and include the MCP servers created above. Note that for UC functions, we select the schema that contains our function; this way, if we create multiple functions in one schema, all of them are exposed as tools of the managed MCP server.

small_playground.drawio.png

Figure 4: Building an agent prototype with MCP servers in the Databricks Playground.

As a last step, let us add a system prompt to our agent prototype that makes sure it is aware of the task at hand and of rules regarding output tone, needed clarifications, and the workflow with the provided tools:

Respond **only** to travel-related questions about **train connections**, **weather**, or **sports/leisure activities**.

Politely refuse anything outside these topics.

Rules
- Stay within travel, weather, sport, and leisure domains.
- If uncertain, ask for clarification or refuse.
- Be concise, factual, and well-organized.
- Always produce a short travel plan including:
    - Train connection
    - Weather summary
    - Recommended activity
    - One-line reason linking activity to weather/context
- Replace umlauts in your output.

Workflow (in this particular order)
1. Ask for departure station, destination, and activity if missing.
2. Check weather (use daily forecast if >12h ahead).
3. Search local sports/leisure options and pick up to 3 fitting the weather.
4. Find train connections.
5. Output a concise final travel plan with train, weather, activity, and reason.

Refusal:
If the request is not about travel, weather, or leisure/sport: “Sorry, I can only help with travel-related questions such as train connections, weather, or sports and leisure activities.”

Finally, let’s give it a try as we want to go skiing tomorrow in famous Zermatt, Switzerland: 

“I want to go skiing tomorrow (14.01.2026) in Zermatt, Switzerland. I want to start at Zurich HB in the early morning.”

Our agent reasons about our query, calls the respective four tools of our MCP servers, and crafts the following response, representing a decent travel plan:

output_playground.png

Figure 5: Agent response in Databricks Playground.

Note that the automatic creation of an agent notebook will be available soon, in conjunction with the use of MCP servers. Agent notebooks provide an out-of-the-box Python definition of your agent, including testing and evaluation, registering your agent in UC, and deploying it as an endpoint. See the documentation for more details.

 

Bringing it all together in code

We also want to showcase here how to define our agent in code so that we can then move our work to the testing, evaluation, and deployment phase. We will start by using a sample notebook from the Databricks documentation that implements an agent with LangGraph. Clone the notebook by copying the link (‘copy link for import’) and importing it with the URL option in your workspace. Going through the notebook, we will implement a few changes in the agent definition in code cell #5.

Starting with the following snippet, we a) change the LLM endpoint name (see the full list of foundation models here) to GPT 5.2 and b) include our system prompt from above:

LLM_ENDPOINT_NAME = "databricks-gpt-5-2"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)

system_prompt = """
Respond **only** to travel-related questions about **train connections**, 
**weather**, or **sports/leisure activities**.

Politely refuse anything outside these topics.

Rules
- Stay within travel, weather, sport, and leisure domains.
- If uncertain, ask for clarification or refuse.
- Be concise, factual, and well-organized.
- Always produce a short travel plan including:
      - Train connection
      - Weather summary
      - Recommended activity
      - One-line reason linking activity to weather/context
- Replace umlauts in your output.

Workflow (in this particular order)
1. Ask for departure station, destination, and activity if missing.
2. Check weather (use daily forecast if >12h ahead).
3. Search local sports/leisure options and pick up to 3 fitting the weather.
4. Find train connections.
5. Output a concise final travel plan with train, weather, activity, and reason.

Refusal:
If the request is not about travel, weather, or leisure/sport: “Sorry, I can 
only help with travel-related questions such as train connections, weather, or 
sports and leisure activities.”
"""

As a next step, we need to configure the WorkspaceClient, part of the Databricks SDK, which provides a unified interface for interacting with the APIs of the Databricks workspace.

In the simplest case, if you create a WorkspaceClient object in a notebook, Databricks manages the connection automatically using your workspace settings and a Personal Access Token (PAT). However, since we are using a custom MCP server, OAuth with a service principal for machine-to-machine (M2M) authentication is required. Follow these steps to retrieve the credentials needed for OAuth M2M:

  • Make sure you have a service principal (SP) created for this application. See the documentation on how to create an SP at the account/workspace level.
  • Ensure that your SP has entitlements for workspace access and, if tools require it, Databricks SQL access (these options can be found in the configuration tab of your SP).
  • Create an OAuth secret, following the steps in the documentation, and store the client ID and secret as Databricks secrets accessible via dbutils.
  • Make sure the SP has the following permissions:
    • EXECUTE on the UC function used by the managed MCP server (also grant USE CATALOG and USE SCHEMA)
    • CAN USE on the UC connection pointing to the external MCP server
    • CAN USE on the Databricks App that hosts the custom MCP server

We are now ready to use the SP credentials (client ID and secret stored via dbutils) to create the WorkspaceClient. In the sample notebook, you can comment out or delete the standard PAT-based WorkspaceClient declaration.

from databricks.sdk import WorkspaceClient

client_id = dbutils.secrets.get(scope="<scope-name>", key="<client-id>")
client_secret = dbutils.secrets.get(scope="<scope-name>", key="<client-secret>")

mcp_server_workspace_client = WorkspaceClient(
    host="https://adb-984752964297111.11.azuredatabricks.net/",
    client_id=client_id,
    client_secret=client_secret,
    auth_type="oauth-m2m",  # Enables service principal authentication
)

After successful authentication, we leverage the DatabricksMultiServerMCPClient to define our three MCP servers. Since we use our SP with OAuth M2M for all of them, the WorkspaceClient object from above is passed along for each server object:

host = mcp_server_workspace_client.config.host

databricks_mcp_client = DatabricksMultiServerMCPClient(
   [
       DatabricksMCPServer(
           name="train-connection",
           url=f"{host}/api/2.0/mcp/functions/travel_agents/train_agent",
           workspace_client=mcp_server_workspace_client
       ),
       DatabricksMCPServer(
           name="web-search",
           url=f"{host}/api/2.0/mcp/external/mcp-websearch-assistant",
           workspace_client=mcp_server_workspace_client
       ),
       DatabricksMCPServer(
           name="weather",      
           url="https://mcp-server-try-out-984752964297111.11.azure.databricksapps.com/mcp",
           workspace_client=mcp_server_workspace_client
       )
   ]
)

The remaining code in the notebook cell is sufficient for this project, and we can run the notebook. The output of cell #5 produces an agent.py, which we can then import to test our agent. The next code sample invokes the agent with the same query we used in the Playground example above. For cosmetics, we use IPython to format the output nicely in the notebook.

from agent import AGENT
from IPython.display import Markdown, display

# Test with a travel-related query
response = AGENT.predict({
   "input": [{"role": "user", "content": "I want to go skiing tomorrow (14.01.2026) "
              "in Zermatt, Switzerland. I want to start at Zurich HB in the early morning."}]
})

# Extract the text content from the response
if response.output and len(response.output) > 0:
   text_content = ""
   for item in response.output:
       if hasattr(item, 'content') and item.content:
           for content_item in item.content:
               if isinstance(content_item, dict) and content_item.get('type') == 'output_text':
                   text_content += content_item.get('text', '')
   # Display as markdown
   display(Markdown(text_content))
else:
   print("No response content to display")
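The extraction loop above can also be factored into a small helper, which makes it unit-testable outside the notebook; a sketch assuming the same response shape (the stand-in class below exists only for the dry run):

```python
def extract_output_text(output: list) -> str:
    """Collect all 'output_text' fragments from a response output list."""
    text = ""
    for item in output:
        for content_item in (getattr(item, "content", None) or []):
            if isinstance(content_item, dict) and content_item.get("type") == "output_text":
                text += content_item.get("text", "")
    return text

# Dry run with a stand-in for the agent response structure
class _Item:
    def __init__(self, content):
        self.content = content

fake_output = [_Item([
    {"type": "output_text", "text": "Take the 06:02 train. "},
    {"type": "output_text", "text": "Snow expected."},
])]
print(extract_output_text(fake_output))
# Take the 06:02 train. Snow expected.
```

In the notebook, `extract_output_text(response.output)` would replace the inline loop with identical behavior.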

Finally, we can inspect the output and evaluate the travel plan the agent has crafted. If we compare the output with the one from the Playground in Figure 5, we can see that the train connection and also the weather data match exactly. Upon inspecting the recommended activity, we notice that the results are not identical. The reason for this lies in the implementation differences of our agent code above and the Databricks Playground, which result in different input queries to the web search MCP server.

output_agent_implementation_notebook.png

Figure 6: Agent response with LangGraph notebook implementation.

 

Testing, Evaluation, and deploying the agent to production

To ensure the reliability and efficacy of our multi-tool agent, rigorous testing and evaluation are essential before deployment to production. The Mosaic AI Agent Evaluation within Databricks offers a comprehensive and structured approach for evaluating agent performance. 

The framework supports programmatic evaluation using LLM-as-a-judge patterns, allowing you to define ground-truth scenarios, run the agent against them, and compute metrics automatically. This enables A/B testing between different LLM endpoints, prompt variations, or MCP server configurations. The notebook we have been working with during this project contains the basics of how to use the evaluation framework; LLM-as-a-judge is explained in detail here.

Once the agent has met performance criteria, it is ready for deployment. The notebook contains the following two steps for the path to production:

  1. MLflow Registration: The LangGraph-based agent code can be logged and registered in the Unity Catalog using MLflow, establishing a clear lineage and version control for the agent model.
  2. Model Serving Endpoint: The registered agent can then be deployed as a low-latency model serving endpoint on Databricks. This process automatically handles the necessary infrastructure scaling and OAuth M2M configurations for tool access.

Finally, if you want to create your own custom front-end for your agent, you can deploy a dedicated Databricks App that hosts your application and calls the deployed model serving endpoint. An example can be found in the Databricks Apps Cookbook here.

 

Conclusion

This blog post demonstrated how to build a sophisticated, agentic travel assistant application by leveraging the Model Context Protocol (MCP) on Databricks. We showcased the integration of three distinct MCP server patterns (managed, external, and custom) that allow us a) to retrieve train connections from an API, b) to perform a web search with You.com, and c) to obtain a weather forecast for a specific location. The showcase highlights a scalable, secure, and governed path for enterprise-wide agent development through the Databricks Agent Framework and Unity Catalog.

Key Takeaways:

  1. MCP Standardizes Tool Interoperability: The Model Context Protocol (MCP) serves as the standardized interface, crucial for scaling AI agent development within organizations by ensuring seamless and orchestrated access to Databricks services and external systems.
  2. Comprehensive MCP Support and Interoperability: Databricks fully supports the MCP protocol, offering developers maximum flexibility for governance and security. This support includes deployment patterns for managed, external, and custom MCP servers. Furthermore, this architecture allows external services to leverage Databricks services via the governed endpoints, ensuring discoverability and bidirectional communication.
  3. Secure, Structured Path to Production: The platform provides a clear production path, transitioning from the Databricks Playground to defining the agent in code (LangGraph), securing access with OAuth, and deploying the agent via MLflow and Model Serving endpoints, ensuring reliable and governed agent-based AI. Drive continuous improvement and system reliability with ongoing monitoring and rigorous evaluation with MLflow.