In the past year, AI applications and their capabilities have advanced significantly, from LLM prompting to retrieval-augmented generation (RAG), to the point where we started developing agentic AI systems that can access tools to query external data sources, call an API, or search the web. By now, there are plenty of frameworks available that a) let you build these agentic systems and b) provide functionality to implement, orchestrate, and make tools available to your agent.
If we scale agent development within an enterprise organization, we certainly want a standard for how agents and tools communicate and how new tools are integrated into a large ecosystem. This is where the Model Context Protocol (MCP), developed by Anthropic in 2024, comes into play. MCP is an open-source standard for connecting agents to external systems (or, to quote the MCP documentation, “Think of MCP like a USB-C port for AI applications”). In 2026, MCP has become the industry standard. To learn more about the basics of MCP, visit the official documentation website.
In this blog, we are going to explore how to leverage MCP on Databricks. For this, we will extend the travel agent use case from this blog to a full-fledged travel assistant backed by MCP on Databricks. In short, this agent accesses live train network data to provide the fastest routes to any destination. First, we will outline how you can leverage MCP on Databricks and design the architecture of the AI travel assistant. Then we will build and test our agent system, first in the Databricks Playground for rapid prototyping, and then in code that lets us evaluate and deploy the solution to production.
MCP is now fully integrated into Databricks Mosaic AI (a suite of tooling designed to help developers build and deploy high-quality generative AI applications), alongside Unity Catalog (UC) function tools and agent code tools, so you can decide per use case what to use, based on governance, flexibility, and integrability requirements (see the documentation for guidance).
Databricks MCP can be deployed in three ways: managed MCP servers, external MCP servers, and custom MCP servers. We briefly walk through these options below, based on our travel agent example. Figure 1 shows the building blocks of our agent.
Figure 1: High-level architecture of the travel assistant Agent with MCP.
Assume we want to develop an agent that is capable of giving a helpful and accurate response to the following query:
“I want to go Nordic skiing in Zermatt, Switzerland. I want to go tomorrow, early morning, from Zurich HB. Make a plan!”
To give a tailored and helpful answer, we need more than just a generic LLM response. Hence, we are leveraging three types of MCP servers that we create in this blog:
- A managed MCP server exposing a UC function that retrieves train connections from the Transport API.
- An external MCP server connecting to You.com’s Web Search API for researching activities.
- A custom MCP server, deployed as a Databricks App, that retrieves weather forecasts from the AccuWeather API.
Before we dive deeper and start developing the MCP servers in Databricks, we quickly want to highlight what each type of MCP server represents and when to use which:
Databricks managed MCP servers provide ready-to-use endpoints. These allow both internal AI agents developed in Databricks and external clients to leverage functionality like UC functions, Vector Search, Genie spaces, and the Databricks SQL (DBSQL) engine. All REST API–based servers fully enforce data permissions and governance via UC, and no infrastructure management is required. For example, a managed MCP server providing UC function tools is exposed via the following URL pattern:
https://<workspace-hostname>/api/2.0/mcp/functions/{catalog}/{schema}
The URL can then be leveraged by an agent deployed with Databricks (as we are going to do below), but also by external systems that want to use managed MCPs (e.g., a Genie space). See the documentation on managed MCPs for more information and example notebooks.
Third-party MCP servers hosted outside of Databricks can be accessed via external MCP servers. Managed proxy endpoints and UC HTTP connections allow for secure access to external tools and APIs without exposing credentials. To install external servers, you can either discover and install an MCP server from the Databricks Marketplace or create a custom UC connection. The server is exposed via the following URL pattern:
https://<workspace-hostname>/api/2.0/mcp/external/{connection_name}
Check out the documentation on external MCP servers to find a setup guide and examples for using them programmatically.
Databricks custom MCP servers enable you to host your own MCP server implementation, or a third-party one, as a Databricks App, providing a simple and managed way to expose custom tools to agents. Implementation requires deploying a Databricks App, and servers must implement an HTTP-compatible transport (such as Streamable HTTP). After the app is deployed, the endpoint is accessible with the pattern:
https://<app-url>/mcp
For setting up the environment, implementing the MCP server, and deploying the server as a Databricks App, refer to the documentation. Note that programmatic access using DatabricksMCPClient with a WorkspaceClient (to list and invoke tools) requires a suitable authentication mechanism depending on which client is used (e.g., in notebooks, the default authentication for the WorkspaceClient does not work, and you need a service principal; see here for details). We will also showcase this setup in the sections below.
We will now follow Figure 1 above and build three different MCP servers that provide the tools for the agent to answer travel assistant queries from users.
We start with the managed MCP server, which hosts and exposes a UC function that performs an API call to the connections endpoint of the Transport API. Note that more information on the API and UC functions can be found in this blog - check it out for details. Here is the code for our UC function:
%sql
CREATE OR REPLACE FUNCTION travel_agents.train_agent.get_connections(
  from_station STRING COMMENT 'The train station of departure',
  to_station STRING COMMENT 'The train station of arrival',
  via_station STRING COMMENT 'The desired stops in between departure and arrival',
  date STRING COMMENT 'Date of the connection, in the format YYYY-MM-DD',
  time STRING COMMENT 'Time of the connection, in the format hh:mm'
)
RETURNS STRING
COMMENT 'Executes a call to the transport api and connections endpoint to retrieve relevant train connections given the input parameters from (departure), to (arrival), via (stops in between, if specified), date, and time.'
LANGUAGE PYTHON
AS $$
  import requests

  url = "http://transport.opendata.ch/v1/connections"
  params = {
      "from": from_station,
      "to": to_station,
      "via": via_station,
      "date": date,
      "time": time,
      "transportations": "train"
  }
  response = requests.get(url, params=params)
  if response.status_code == 200:
      next_connection = response.json()
      # Cast to string to match the declared STRING return type
      return str(next_connection['connections'])
  else:
      return f"Failed to retrieve connection. Status code: {response.status_code}"
$$;
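Note that the catalog travel_agents and the schema train_agent within it must exist before running this code; a minimal setup:

%sql
CREATE CATALOG IF NOT EXISTS travel_agents;
CREATE SCHEMA IF NOT EXISTS travel_agents.train_agent;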
With this UC function created, a managed MCP server is automatically exposed via the following endpoint:
https://<workspace-hostname>/api/2.0/mcp/functions/travel_agents/train_agent
The hostname can be derived from the URL of your Databricks workspace. For example, if you use Azure Databricks, this will resemble https://adb-xxxx.yy.azuredatabricks.net/. Note that the endpoint above exposes the schema, meaning that all functions created within that schema will be exposed as individual tools of that managed MCP server. With this functionality in place, we can a) use it in the Playground for rapid prototyping, or b) take the endpoint URL and use it when implementing our agent in Python. Both options are outlined in the next sections.
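To quickly verify the managed MCP server from a notebook, here is a minimal sketch using the databricks-mcp package (assuming its DatabricksMCPClient with list_tools/call_tool, as shown in the managed MCP examples; exact tool naming may be namespaced by catalog and schema):

from databricks.sdk import WorkspaceClient
from databricks_mcp import DatabricksMCPClient

workspace_client = WorkspaceClient()  # default notebook authentication
host = workspace_client.config.host

mcp_client = DatabricksMCPClient(
    server_url=f"{host}/api/2.0/mcp/functions/travel_agents/train_agent",
    workspace_client=workspace_client,
)

# Every UC function in the schema is exposed as an individual tool
tools = mcp_client.list_tools()
print([t.name for t in tools])

# Invoke our connection lookup through the MCP server
result = mcp_client.call_tool(
    tools[0].name,
    {"from_station": "Zurich HB", "to_station": "Zermatt",
     "via_station": "", "date": "2026-01-14", "time": "06:00"},
)
print(result)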
To find the most relevant activities at your travel destination, we recommend performing a web search. This can help to find the best-rated skiing resorts or beginner hikes in the area, or to check resort and facility opening hours, pricing, conditions, and more. We will use You.com’s Web Search API, which also exposes an MCP server. Follow this guide to create an API key (Bearer token).
Figure 2 depicts the steps of creating a UC connection. If you navigate to Catalog -> External Data -> Connections, you can see the connection that points to the You.com API, and more specifically to its MCP server (/mcp). By following the steps in Figure 2, you create such an HTTP connection, entering the URL and Bearer token obtained from You.com and ensuring that the base path is /mcp.
Figure 2: Creating a UC connection to the external MCP server.
This setup ensures that the connection to external MCP servers is both secure and easily sharable within Databricks workspaces, leveraging managed proxies and automated OAuth flows. Additionally, UC enables discoverability and governance of these connections, simplifying compliance and operational management for external integrations.
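Alternatively to the UI flow, the same UC HTTP connection can be created programmatically; below is a sketch with the Databricks SDK, assuming the connections API and the HTTP option keys shown (the host and token values are placeholders):

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.catalog import ConnectionType

w = WorkspaceClient()
w.connections.create(
    name="mcp-websearch-assistant",
    connection_type=ConnectionType.HTTP,
    options={
        "host": "https://api.you.com",  # assumed You.com API host
        "port": "443",
        "base_path": "/mcp",            # must point at the MCP endpoint
        "bearer_token": "<your-you.com-api-key>",
    },
)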
Finally, we have a UC connection to an external MCP server, exposed via the following endpoint and ready to be leveraged in the Databricks Playground or in custom implementations:
https://<workspace-hostname>/api/2.0/mcp/external/mcp-websearch-assistant
Lastly, we aim to enhance our agentic system with the capability to retrieve a weather forecast for a specific location within a time frame that aligns with the travel plan. For the implementation, we are going to take the following steps:
1. Create a custom MCP server from the Databricks Apps agent template “MCP Server - Simple”.
2. Sync the project to a local IDE and implement two tools that call the AccuWeather API.
3. Sync the code back to the workspace and deploy the Databricks App.
To deploy a Databricks App, you can follow the steps outlined in the Databricks Apps Cookbook. We are going to leverage one of the new agent templates for Databricks Apps, “MCP Server - Simple”, which uses FastMCP under the hood. The template comes with dependencies pre-installed and includes operational tooling such as an MCP server health check out of the box.
Important to note here: ensure the prefix ‘mcp-’ is used when specifying the app name, as this is crucial for the Playground to recognize the custom MCP server.
Figure 3: Creating a custom MCP server by using the agent templates of Databricks Apps.
After successfully creating the custom MCP server, we sync the project to our local IDE using the Databricks CLI. You can use your favorite IDE (VSCode, PyCharm, Cursor, or others). Ensure you have a working setup of the Databricks CLI and an authentication profile configured for your workspace (documentation). Let’s sync the project with the following CLI command:
databricks workspace export-dir /Workspace/Users/<username>/mcp-weather-assistant -p <profile_name> .
Now we can start implementing two tools that will, under the hood, make two separate API calls. First, we need to retrieve an AccuWeather-specific location_key. This is necessary because the forecast API endpoint does not accept a free-text location but rather an explicit location key. We then need to use the location_key to get a forecast with two parameters: forecast_type, which represents the granularity, e.g., ‘hourly’ or ‘daily’, and forecast_horizon, representing the horizon in the format of ‘24hour’, ‘1day’, etc.
We will implement two functions that represent MCP tools. FastMCP provides a set of decorators for implementing MCP functionality. In the app.py template, an object of the FastMCP class is already instantiated, which we will use in the decorator to declare a callable tool, i.e., @mcp_server.tool.
Investigating the AccuWeather Core Weather API documentation, we implement the following function to retrieve the location_key:
@mcp_server.tool
def get_location_key_by_city(city_name: str):
    """
    Get the AccuWeather location key for a given city name. This tool searches for a city
    by name and returns its unique AccuWeather location key, which is required for
    fetching weather forecasts.

    Args:
        city_name: The name of the city to search for (e.g., "Zurich", "New York")

    Returns:
        dict: Contains 'key' (the location key string) and 'location' (full location details)

    Raises:
        ValueError: If no locations are found for the given city name
        requests.HTTPError: If the API request fails
    """
    url = f"{BASE_URL}/locations/v1/cities/search"
    params = {"apikey": api_key, "q": city_name, "language": "en-us"}
    r = requests.get(url, params=params)
    r.raise_for_status()
    locations = r.json()
    if not locations:
        raise ValueError(f"No locations found for {city_name}")
    return {"key": locations[0]["Key"], "location": locations[0]}
By returning the first element of the retrieved location list, we provide the best match and extract the corresponding location_key. As the next step, we use this key to retrieve a forecast for that location. As the API is flexible in terms of forecast horizons (check which horizons require a premium plan!), we implement a tool that takes both the forecast_type and the forecast_horizon as parameters:
@mcp_server.tool
def get_24_hours_weather_forecast_by_location_key(location_key: str,
                                                  forecast_type: str = "hourly",
                                                  forecast_horizon: str = "24hour"):
    """
    Get weather forecast for a given AccuWeather location key.

    Fetches weather predictions for the given forecast type and horizon, including
    temperature, precipitation probability, and weather conditions.

    Args:
        location_key: The AccuWeather location key (obtained from get_location_key_by_city)
        forecast_type: The type of forecast to fetch (e.g., "hourly", "daily")
        forecast_horizon: The horizon of the forecast to fetch (e.g., "24hour", "72hour",
            "1day", "7day")

    Returns:
        list: Array of forecast objects containing:
            - DateTime: ISO timestamp for the forecast hour
            - Temperature: Temperature value and unit (metric/Celsius)
            - IconPhrase: Text description of weather conditions
            - PrecipitationProbability: Chance of precipitation (0-100%)
    """
    url = f"{BASE_URL}/forecasts/v1/{forecast_type}/{forecast_horizon}/{location_key}"
    params = {"apikey": api_key, "language": "en-us", "metric": "true"}
    r = requests.get(url, params=params)
    r.raise_for_status()  # Surface API errors (e.g., horizons that need a premium plan)
    return r.json()
Note that we declared two variables outside of the tool implementations: the base URL of the API endpoints and the api_key, which corresponds to the API key retrieved from AccuWeather and is fetched via dbutils (see documentation):
BASE_URL = "https://dataservice.accuweather.com"
api_key = dbutils.secrets.get(scope="<scope-name>", key="<key-name>")
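If the secret scope does not exist yet, it can be created and populated via the Databricks CLI (scope and key names are the placeholders from the snippet above; put-secret prompts for the secret value):

databricks secrets create-scope <scope-name> -p <profile_name>
databricks secrets put-secret <scope-name> <key-name> -p <profile_name>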
Finally, we add these code snippets to the tool.py of our Databricks App project. Once done, we can sync the code back to our Databricks workspace by running
databricks sync --watch . /Workspace/Users/<username>/mcp-weather-assistant -p <profile_name>
and deploy the app with the following CLI command (you can also achieve this in the UI by clicking the deploy button):
databricks apps deploy mcp-weather-assistant --source-code-path /Workspace/Users/<username>/mcp-weather-assistant
We have successfully deployed our custom weather forecast MCP server on Databricks!
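Before wiring the server into an agent, we can smoke-test it from a notebook; here is a sketch assuming the databricks-mcp package and a service principal with access to the app (see the authentication note earlier; placeholders mark values to fill in):

from databricks.sdk import WorkspaceClient
from databricks_mcp import DatabricksMCPClient

# Custom MCP servers require service principal (OAuth M2M) authentication
w = WorkspaceClient(
    host="https://<workspace-hostname>",
    client_id=dbutils.secrets.get(scope="<scope-name>", key="<client-id>"),
    client_secret=dbutils.secrets.get(scope="<scope-name>", key="<client-secret>"),
    auth_type="oauth-m2m",
)

mcp_client = DatabricksMCPClient(server_url="https://<app-url>/mcp", workspace_client=w)
print([t.name for t in mcp_client.list_tools()])  # expect our two weather tools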
For rapid prototyping and testing of our agent, we will use the Databricks Playground in the Databricks UI. MCP servers are now fully integrated with the Playground, and we can simply select the servers we created above. As highlighted below, navigate to the Playground -> Add tool -> MCP servers. From there, we can include our MCP servers created above. Note that for UC functions, we select the schema that contains our function. This way, if we create multiple functions in one schema, all of them are exposed as tools of the managed MCP server.
Figure 4: Building an agent prototype with MCP servers in the Databricks Playground.
As a last step, let us add a system prompt to our agent prototype that makes sure it is aware of the task at hand, as well as the rules regarding output tone, needed clarifications, and the workflow with the provided tools:
Respond **only** to travel-related questions about **train connections**, **weather**, or **sports/leisure activities**.
Politely refuse anything outside these topics.
Rules
- Stay within travel, weather, sport, and leisure domains.
- If uncertain, ask for clarification or refuse.
- Be concise, factual, and well-organized.
- Always produce a short travel plan including:
- Train connection
- Weather summary
- Recommended activity
- One-line reason linking activity to weather/context
- Replace umlauts in your output.
Workflow (in this particular order)
1. Ask for departure station, destination, and activity if missing.
2. Check weather (use daily forecast if >12h ahead).
3. Search local sports/leisure options and pick up to 3 fitting the weather.
4. Find train connections.
5. Output a concise final travel plan with train, weather, activity, and reason.
Refusal:
If the request is not about travel, weather, or leisure/sport: “Sorry, I can only help with travel-related questions such as train connections, weather, or sports and leisure activities.”
Finally, let’s give it a try as we want to go skiing tomorrow in famous Zermatt, Switzerland:
“I want to go skiing tomorrow (14.01.2026) in Zermatt, Switzerland. I want to start at Zurich HB in the early morning.”
Our agent reasons about our query, calls the four respective tools of our MCP servers, and crafts the following response, representing a decent travel plan:
Figure 5: Agent response in Databricks Playground.
Note that the automatic creation of an agent notebook will be available soon, in conjunction with the use of MCP servers. Agent notebooks provide an out-of-the-box Python definition of your agent, including testing and evaluation, registering your agent in UC, and deploying it as an endpoint. See the documentation for more details.
We also want to showcase here how to define our agent in code so that we can then move our work to the testing, evaluation, and deployment phase. We will start by using a sample notebook from the Databricks documentation that implements an agent with LangGraph. Clone the notebook by copying the link (‘copy link for import’) and importing it with the URL option in your workspace. Going through the notebook, we will implement a few changes in the agent definition in code cell #5.
Starting with the following snippet, we want to a) change the LLM endpoint name (see the full list of foundation models here) to GPT 5.2, and b) include our system prompt from above:
LLM_ENDPOINT_NAME = "databricks-gpt-5-2"
llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)
system_prompt = """
Respond **only** to travel-related questions about **train connections**,
**weather**, or **sports/leisure activities**.
Politely refuse anything outside these topics.
Rules
- Stay within travel, weather, sport, and leisure domains.
- If uncertain, ask for clarification or refuse.
- Be concise, factual, and well-organized.
- Always produce a short travel plan including:
- Train connection
- Weather summary
- Recommended activity
- One-line reason linking activity to weather/context
- Replace umlauts in your output.
Workflow (in this particular order)
1. Ask for departure station, destination, and activity if missing.
2. Check weather (use daily forecast if >12h ahead).
3. Search local sports/leisure options and pick up to 3 fitting the weather.
4. Find train connections.
5. Output a concise final travel plan with train, weather, activity, and reason.
Refusal:
If the request is not about travel, weather, or leisure/sport: “Sorry, I can
only help with travel-related questions such as train connections, weather, or
sports and leisure activities.”
"""
As a next step, we need to configure the WorkspaceClient, which is part of the Databricks SDK and provides a unified interface for interacting with the APIs of a Databricks workspace.
In the simplest case, if you create a WorkspaceClient object in a notebook, Databricks manages the connection automatically with your workspace settings and a Personal Access Token (PAT). Nevertheless, as we are using a custom MCP server, OAuth with a service principal (SP) for machine-to-machine (M2M) authentication is required. Follow these steps to retrieve the needed credentials for OAuth M2M:
1. Create a service principal in your workspace (or reuse an existing one).
2. Generate an OAuth secret for the service principal to obtain a client ID and client secret.
3. Store the client ID and client secret in a Databricks secret scope.
4. Grant the service principal access to the resources used by the agent (the UC functions, the HTTP connection, and the Databricks App).
We are now ready to use the SP credentials (client ID and secret stored in a secret scope, accessed via dbutils) to create the WorkspaceClient. In the sample notebook, you can comment out or delete the standard PAT-based WorkspaceClient declaration.
client_id = dbutils.secrets.get(scope="<scope-name>", key="<client-id>")
client_secret = dbutils.secrets.get(scope="<scope-name>", key="<client-secret>")

mcp_server_workspace_client = WorkspaceClient(
    host="https://adb-984752964297111.11.azuredatabricks.net/",
    client_id=client_id,
    client_secret=client_secret,
    auth_type="oauth-m2m",  # Enables service principal authentication
)
After successful authentication, we leverage the DatabricksMultiServerMCPClient to define our three MCP servers. As all of them use our SP with OAuth M2M, the WorkspaceClient object from above is passed along for each server object:
host = mcp_server_workspace_client.config.host

databricks_mcp_client = DatabricksMultiServerMCPClient(
    [
        DatabricksMCPServer(
            name="train-connection",
            url=f"{host}/api/2.0/mcp/functions/travel_agents/train_agent",
            workspace_client=mcp_server_workspace_client,
        ),
        DatabricksMCPServer(
            name="web-search",
            url=f"{host}/api/2.0/mcp/external/mcp-websearch-assistant",
            workspace_client=mcp_server_workspace_client,
        ),
        DatabricksMCPServer(
            name="weather",
            url="https://mcp-server-try-out-984752964297111.11.azure.databricksapps.com/mcp",
            workspace_client=mcp_server_workspace_client,
        ),
    ]
)
The remaining code within the notebook cell is sufficient for this project, and we can run the notebook. The output of cell #5 is an agent.py, which we can then import to test our agent. The next code sample invokes the agent with the same query that we used in the Playground example above. For nicer formatting, we use IPython to render the output in the notebook.
from agent import AGENT
from IPython.display import Markdown, display

# Test with a travel-related query
response = AGENT.predict({
    "input": [{"role": "user", "content": "I want to go skiing tomorrow (14.01.2026) in Zermatt, Switzerland. I want to start at Zurich HB in the early morning."}]
})

# Extract the text content from the response
if response.output and len(response.output) > 0:
    text_content = ""
    for item in response.output:
        if hasattr(item, 'content') and item.content:
            for content_item in item.content:
                if isinstance(content_item, dict) and content_item.get('type') == 'output_text':
                    text_content += content_item.get('text', '')
    # Display as markdown
    display(Markdown(text_content))
else:
    print("No response content to display")
Finally, we can inspect the output and evaluate the travel plan the agent has crafted. If we compare the output with the one from the Playground in Figure 5, we can see that the train connection and the weather data match exactly. Upon inspecting the recommended activity, we notice that the results are not identical. The reason lies in implementation differences between our agent code above and the Databricks Playground, which result in different input queries to the web search MCP server.
Figure 6: Agent response with LangGraph notebook implementation.
To ensure the reliability and efficacy of our multi-tool agent, rigorous testing and evaluation are essential before deployment to production. Mosaic AI Agent Evaluation in Databricks offers a comprehensive and structured approach for evaluating agent performance.
The framework supports programmatic evaluation using LLM-as-a-judge patterns, allowing you to define ground truth scenarios, run the agent against them, and compute metrics automatically. This enables A/B testing between different LLM endpoints, prompt variations, or MCP server configurations. The notebook we have been working with during this project contains the basics of how to work with the evaluation framework. LLM-as-a-judge is explained in detail here.
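As a minimal illustration (the ground-truth scenario below is hypothetical, and API details vary across MLflow versions), a pre-computed agent answer can be scored with the built-in LLM judges via mlflow.evaluate and the databricks-agent model type:

import mlflow
import pandas as pd

# Hypothetical ground-truth scenario; we reuse the answer text extracted above
eval_df = pd.DataFrame({
    "request": ["I want to go skiing tomorrow (14.01.2026) in Zermatt, Switzerland. "
                "I want to start at Zurich HB in the early morning."],
    "response": [text_content],  # agent answer extracted in the previous snippet
    "expected_response": ["A concise travel plan with a train connection from Zurich HB "
                          "to Zermatt, a weather summary, and a recommended activity."],
})

results = mlflow.evaluate(
    data=eval_df,
    model_type="databricks-agent",  # enables the Mosaic AI LLM judges
)
display(results.tables["eval_results"])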
Once the agent has met the performance criteria, it is ready for deployment. The notebook contains the following two steps for the path to production (sketched below):
1. Register the agent as a model in Unity Catalog via MLflow.
2. Deploy the registered model to a Mosaic AI Model Serving endpoint.
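A minimal sketch of these two steps, assuming the agent was logged with MLflow earlier in the notebook (the UC model name and run ID are placeholders):

import mlflow
from databricks import agents

mlflow.set_registry_uri("databricks-uc")

UC_MODEL_NAME = "travel_agents.train_agent.travel_assistant"  # placeholder UC model name
registered = mlflow.register_model(
    model_uri="runs:/<mlflow-run-id>/agent",  # URI of the agent logged in the notebook
    name=UC_MODEL_NAME,
)

# Create a Mosaic AI Model Serving endpoint for the registered agent version
agents.deploy(UC_MODEL_NAME, registered.version)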
Finally, if you want to create your own custom front-end for your agent, you can deploy a dedicated Databricks App that hosts your application and calls the deployed model serving endpoint. An example can be found in the Databricks Apps Cookbook here.
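For instance, such an app could call the deployed endpoint through the MLflow deployments client (the endpoint name is a placeholder for the one created by agents.deploy):

from mlflow.deployments import get_deploy_client

client = get_deploy_client("databricks")
response = client.predict(
    endpoint="<agent-serving-endpoint-name>",
    inputs={"input": [{"role": "user", "content":
        "I want to go skiing tomorrow in Zermatt. I start at Zurich HB early morning."}]},
)
print(response)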
This blog post demonstrated how to build a sophisticated, agentic travel assistant application by leveraging the Model Context Protocol (MCP) on Databricks. We showcased the integration of three distinct MCP server patterns (managed, external, and custom) that allow us a) to retrieve train connections from an API, b) to perform a web search with You.com, and c) to obtain a weather forecast for a specific location. The showcase highlights a scalable, secure, and governed path for enterprise-wide agent development through the Databricks Agent Framework and Unity Catalog.
Key Takeaways: