Advanced AI systems go beyond simple question-answering—they can interact with live data, call tools dynamically, and adapt to real-world complexities. In transportation, real-time scheduling is a perfect challenge: users need accurate, up-to-the-minute information on routes, delays, and connections.
This blog demonstrates how to build an AI-powered travel agent on Databricks that integrates real-time train data through external APIs. Rather than relying solely on a large language model (LLM), we design a system in which the agent can retrieve live train schedules, process user requests, and return actionable insights. For that, we use the Transport API, which provides train schedule data, and build an agent that leverages two tools implemented as Unity Catalog functions. We also show how the agent can be deployed and evaluated with the Mosaic AI Agent Framework and Agent Evaluation.
Figure 1: High-level architecture of the used Mosaic AI Agent Framework components.
Key steps covered in this blog:
By following this approach, you’ll develop a fully functional AI assistant capable of handling real-time train queries - demonstrating how AI can bridge the gap between raw data and intelligent decision-making.
To build a travel agent that lets users ask for train connections in Switzerland, we use public timetable data through the Transport API provided by Opendata.ch. Note that this approach is not limited to this particular API: you can build an agent in a similar fashion for other railway organizations, for point-to-point travel duration and distance with the Google Distance Matrix API, or for flights with the Skyscanner Developer APIs.
The Transport API builds on REST and provides three endpoints to gather location, connection, and station board data:
Detailed documentation about the endpoints can be found here. Note that the API allows for an extension to include other transport modes. Building this AI agent, we focus on train connections and station board data, i.e., we leverage the API endpoints /connections and /stationboard.
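Before wrapping these endpoints in agent tools, it helps to see the raw request and response shape. The sketch below assembles a /connections request and extracts a few fields from a sample payload; the parameter and field names follow the Transport API documentation, while the helper functions and the abbreviated sample payload are our own illustrations:

```python
BASE_URL = "http://transport.opendata.ch/v1"

def build_connections_request(from_station, to_station, via_station=None):
    """Assemble the URL and query parameters for the /connections endpoint."""
    params = {"from": from_station, "to": to_station, "transportations": "train"}
    if via_station:
        params["via"] = via_station
    return f"{BASE_URL}/connections", params

def summarize_connection(connection):
    """Pull departure and arrival info out of a single connection object."""
    return {
        "from": connection["from"]["station"]["name"],
        "to": connection["to"]["station"]["name"],
        "departure": connection["from"]["departure"],
    }

# Abbreviated, illustrative payload in the shape the API returns.
sample = {
    "connections": [
        {
            "from": {"station": {"name": "Zürich HB"},
                     "departure": "2025-01-01T10:02:00+0100"},
            "to": {"station": {"name": "Genève"}},
        }
    ]
}

url, params = build_connections_request("Zürich", "Genève", via_station="Bern")
summary = summarize_connection(sample["connections"][0])
```

The same pattern applies to /stationboard; only the parameter names change.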
An agent leverages an LLM as an engine to reason about actions to take. An action involves utilizing tools that allow the agent to, e.g., search the web, retrieve certain internal or external data (e.g., product documentation), or call APIs. Different tools can be created as interfaces for an agent to use when useful for the task at hand.
In this blog, we will build our tools with Unity Catalog functions. Unity Catalog is a core part of the Databricks Data Intelligence Platform and provides a centralized governance and management system across multiple workspaces, ensuring data consistency and security. Hence, we can benefit from these core advantages using Unity Catalog functions:
This way, we organize, manage, and govern our Unity Catalog functions in the same fashion as we already do with our data, models, and other assets made available through the Unity Catalog. For functions in particular, the owner can grant EXECUTE permissions to users or service principals, allowing the function to be used. In our case, we utilize Unity Catalog functions to create AI agent tools that execute custom logic and perform specific tasks that extend the capabilities of LLMs beyond language generation - in our case, train connection retrieval.
In general, there are four types of agent tools that we are going to distinguish here:
In the following example, we will focus on external connection tools as we need to connect to the Transport API and implement custom Python code that handles the request and processes the response before returning information to the agent. Nevertheless, the flexibility outlined above highlights that there are no limits when it comes to defining, managing, and especially governing your custom agent tools.
To integrate your train connection agent with Unity Catalog, you'll need to first create a catalog and a schema for your project. This allows you to segregate your projects and also to perform fine-grained governance. We will do this here with two simple SQL statements in a notebook:
%sql
CREATE CATALOG IF NOT EXISTS travel_agents;
CREATE SCHEMA IF NOT EXISTS travel_agents.train_agent;
In this small project, we will create two functions that the agent can use as tools. First, we want to retrieve train connections from the Transport API. According to the documentation of the connections endpoint, a departure and an arrival location are required for the API request. In addition, we implement the via option in case a user prompts the agent with, e.g., ‘I want to go from Zurich to Geneva but I need to make a stop in Luzern’.
The Transport API does not require any authentication. Therefore, we don’t need to create a connection here. Nevertheless, if you want to use an external service where authentication is required, you can leverage Unity Catalog connections to do so.
We start off by utilizing CREATE FUNCTION and implementing the prototype of the function, i.e., we specify the three parameters and the return object as a string. It is important to note that we leverage the COMMENT keyword to document a) what each input parameter represents and b) what functionality is provided by this function. This matters because the agent uses this information to reason about whether a tool is the right choice for a given task.
The core functionality of the function is implemented in Python. What follows is a typical Python implementation of a REST API call with the requests package. We declare a dictionary for the parameters, where the keys represent the parameters specified in the API documentation and the values correspond to the Unity Catalog function input parameters. Finally, we process the payload by returning the content of the connections element. If the request fails, an error message is returned to the agent.
%sql
CREATE OR REPLACE FUNCTION travel_agents.train_agent.get_connections(
  from_station STRING COMMENT 'The train station of departure',
  to_station STRING COMMENT 'The train station of arrival',
  via_station STRING COMMENT 'The desired stops in between departure and arrival'
)
RETURNS STRING
COMMENT 'Executes a call to the transport API and connections endpoint to retrieve relevant train connections given the input parameters from (departure), to (arrival), via (stops in between, if specified).'
LANGUAGE PYTHON
AS $$
  import requests

  url = "http://transport.opendata.ch/v1/connections"
  params = {
      "from": from_station,
      "to": to_station,
      "via": via_station,
      "transportations": "train"
  }
  response = requests.get(url, params=params)
  if response.status_code == 200:
      next_connection = response.json()
      # The function is declared to return a STRING, so cast the payload
      return str(next_connection['connections'])
  else:
      return f"Failed to retrieve connection. Status code: {response.status_code}"
$$;
In a very similar fashion, we create another function that retrieves the station board data of a given train station. For that, we leverage the stationboard endpoint of the Transport API. The station parameter is required as a string, and we use the optional type parameter to indicate whether the user wants arrival or departure data:
%sql
CREATE OR REPLACE FUNCTION travel_agents.train_agent.get_station_board(
  station STRING COMMENT 'The train station for which to retrieve the station board',
  arrival_or_departure STRING COMMENT 'Either "arrival" or "departure", depending on which board the user asks for'
)
RETURNS STRING
COMMENT 'Executes a call to the transport API and stationboard endpoint to retrieve the arrival or departure board of a given train station.'
LANGUAGE PYTHON
AS $$
  import requests

  url = "http://transport.opendata.ch/v1/stationboard"
  params = {
      "station": station,
      "type": arrival_or_departure,
      "limit": 15,
      "transportations": "train"
  }
  response = requests.get(url, params=params)
  if response.status_code == 200:
      station_board = response.json()
      # The function is declared to return a STRING, so cast the payload
      return str(station_board)
  else:
      return f"Failed to retrieve station board. Status code: {response.status_code}"
$$;
Note that the hard-coded values for the limit of returned records and the transportation mode can be changed according to the documentation or introduced as function parameters. The code for the UC functions can also be found in the GitHub repository here.
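As a sketch of that refactoring, here is the station board logic in plain Python with the limit and transport mode exposed as parameters. The fetch argument is injected so the logic can be exercised without a live API call; the function and parameter names here are illustrative, not part of the Transport API or Databricks:

```python
def get_station_board(station, board_type, limit=15,
                      transportations="train", fetch=None):
    """Build and execute a /stationboard request with a configurable
    record limit and transport mode.

    `fetch` is injected (in the Unity Catalog function it would simply
    be requests.get) so the logic is testable offline.
    """
    params = {
        "station": station,
        "type": board_type,          # "arrival" or "departure"
        "limit": limit,
        "transportations": transportations,
    }
    response = fetch("http://transport.opendata.ch/v1/stationboard",
                     params=params)
    if response.status_code == 200:
        return str(response.json())
    return f"Failed to retrieve station board. Status code: {response.status_code}"
```

Exposing limit as a function parameter would also let the agent decide how many records to request based on the user's question.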
Finally, we check in Unity Catalog whether our functions were created as expected. Given the three-level namespace of Unity Catalog, we expect a catalog travel_agents, a schema train_agent, and the two functions get_connections and get_station_board.
Figure 2: Overview of created functions in the Unity Catalog explorer.
In this section, we will walk through the step-by-step process of creating an AI agent in Databricks Playground, integrating it with real-time train schedule data from the Swiss rail network. By the end of this guide, you will have a working AI agent capable of answering queries like:
"I want to go from Zurich to Geneva now. Is there a train leaving in the next hour?"
Before integrating the tools into an AI agent, we test them in the Databricks Playground to ensure they work as expected. Testing in a controlled environment ensures the agent has reliable access to real-time data before deployment. More details can be found here.
Step 1: Access Databricks Playground
Before adding tools to the AI agent, we must set up the environment and verify tool functionality. The Databricks Playground provides a controlled space to test and validate tools before full deployment.
Figure 3: Navigating to the Databricks Playground
Step 2: Add Tools to the AI Agent
Now, we integrate the travel planning tools created above into the AI agent. These tools allow the agent to find train connections and retrieve station board data from any train station in Switzerland in real-time.
Figure 4: Selection of tools as Unity Catalog functions in the Playground.
Step 3: Add a System Prompt
A system prompt needs to be specified to define the agent's behavior, set constraints, and provide context. For this travel AI agent, the system prompt takes care of the following:
The complete prompt template can be found in the GitHub repository here. To add the system prompt to our test environment:
Step 4: Test and Validate the Agent
Before deploying the AI agent, we must ensure it correctly interprets queries and calls the right tools.
Example:
Prompt: "Find the next train from Zurich to Bern."
The agent should utilize the tool get_connections and return relevant train connections.
Example:
Prompt: "What are the next three trains leaving from Geneva?"
The agent should utilize the tool get_station_board with the departure parameter and extract the next three connections from the retrieved payload.
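Extracting the next few departures from the stationboard payload is a simple slice over its entries. A sketch, assuming the field names of the /stationboard response (the helper name is illustrative):

```python
def next_departures(stationboard_payload, n=3):
    """Return (train name, departure time) pairs for the first n
    entries of a /stationboard payload.

    Field names follow the Transport API stationboard response shape.
    """
    entries = stationboard_payload.get("stationboard", [])[:n]
    return [(e["name"], e["stop"]["departure"]) for e in entries]

# Abbreviated, illustrative payload:
payload = {"stationboard": [
    {"name": "IR 15", "stop": {"departure": "2025-01-01T10:04:00+0100"}},
    {"name": "IC 1",  "stop": {"departure": "2025-01-01T10:09:00+0100"}},
    {"name": "S 3",   "stop": {"departure": "2025-01-01T10:12:00+0100"}},
    {"name": "IC 5",  "stop": {"departure": "2025-01-01T10:20:00+0100"}},
]}
```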
If the response is incorrect, verify:
The Unity Catalog functions return expected results.
The agent is correctly calling the tools based on input queries.
Step 5: Export and Deploy the AI Agent
Once the AI agent functions as expected, we export the setup into a Databricks Notebook for deployment.
Figure 5: Exporting of the automatically generated agent implementation.
By following these steps, you now have a fully functional AI-powered travel assistant, capable of answering real-time train schedule queries using Databricks Playground and the Swiss Transport API. The driver notebook allows you to define, test, evaluate, and deploy the agent. Be aware that the notebook contains a few sections where information needs to be filled in (e.g., the catalog, schema, and model name where the agent should be deployed). The code with these steps incorporated is available in the GitHub repository here.
When you export your AI agent using the driver notebook in Databricks, both Agent Evaluation and the Review App are automatically integrated. This enables you to assess and refine the agent before deployment, ensuring that it is fully optimized and ready for real-world queries.
Agent evaluation is a crucial process for ensuring that the AI agent performs as expected, delivering accurate, efficient, and reliable responses. The Mosaic AI Agent Evaluation framework helps automate the process of testing and assessing agent performance, optimizing the system before going live. It evaluates the agent’s effectiveness based on key criteria like accuracy, consistency, and efficiency.
Here, we specified three global guidelines according to the documentation's example: a rejection guideline that ensures unrelated queries are rejected, a conciseness guideline requiring that the response include a train line and details about the journey, and a guideline ensuring that the response is professional. By default, the evaluation framework also checks for relevance and safety. The corresponding code is in the GitHub repository here.
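In code, such guidelines are plain natural-language strings grouped by name. The sketch below shows one plausible shape for the three guidelines described above; the wording is ours, and the exact configuration keys expected by Agent Evaluation should be checked against the documentation:

```python
# Illustrative guideline definitions; the dictionary shape
# (name -> list of guideline strings) is an assumption.
global_guidelines = {
    "rejection": [
        "The assistant must reject questions that are unrelated to "
        "Swiss train travel."
    ],
    "conciseness": [
        "The response must include the train line and key details "
        "about the journey."
    ],
    "professional": [
        "The response must be written in a professional tone."
    ],
}
```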
Figure 6: Evaluation framework example with custom-defined global guidelines.
Along with the framework, the Agent Review App allows users to simulate real-world queries and assess the agent's performance. It provides an interactive space to evaluate responses, identify potential issues, and fine-tune the agent. If you execute all cells of the driver notebook that we exported above from the Playground, you get an automatically generated link to the Review App after deploying your agent. Alternatively, you can also invoke the Review App with the code snippet here:
import mlflow
from databricks.agents import review_app
# The review app is tied to the current MLflow experiment.
mlflow.set_experiment("same_exp_used_to_deploy_the_agent")
my_app = review_app.get_review_app()
print(my_app.url)
print(my_app.url + "/chat") # For "Chat with the bot".
By leveraging these tools, you can continuously monitor and improve the AI agent, ensuring it meets the desired performance standards. This makes it easier to deploy reliable and high-performing AI agents that are equipped to handle real-world queries effectively.
To ensure accurate and trustworthy responses, AI agents could leverage retrieval-augmented generation (RAG) and integrate with structured data sources. By using external databases, APIs, or enterprise knowledge bases, AI agents can dynamically fetch real-time information instead of relying solely on pre-trained knowledge, reducing hallucinations and improving decision-making. Here is an example of calling RAG with tools in Databricks.
AI agents should be assessed regularly to measure accuracy, response quality, and efficiency. Databricks Mosaic AI Agent Evaluation Framework provides structured evaluation methods, allowing developers to refine agent behavior, debug issues, and improve tool usage. Iterative testing in Databricks Playground ensures robust performance before deployment.
AI agents must be transparent, secure, and aligned with ethical AI principles. Databricks Unity Catalog enforces governance with role-based access control, while the Databricks Review App helps users analyze and refine AI agent outputs. For organizations prioritizing fairness, accountability, and bias mitigation, Databricks provides comprehensive resources on Responsible AI to guide best practices.
By following these principles, businesses can deploy AI agents that are reliable, scalable, and aligned with enterprise goals.
In this blog, we demonstrate the ease of building AI-powered travel agents using Databricks. By using the Data Intelligence Platform and in particular the Mosaic AI Agent Framework, we've created a sophisticated AI assistant capable of handling real-time train queries for the Swiss rail network.
Key takeaways from this guide include:
This project serves as a foundation that can be easily extended and adapted for various transportation applications or other domains with AI agent applications. Moreover, the work presented in this blog can be further enhanced using Databricks Apps, allowing for the creation of interactive, user-friendly interfaces for your AI agents. With Databricks Apps, you can deploy your travel assistant as a full-fledged application, complete with custom UI elements, making it even more accessible and valuable to end-users.