Everyone wants to use Generative AI to solve business problems, but developers have realized that simply deploying a large language model isn’t going to be enough to cross the finish line. Business problems are complex and require a solution capable of seamlessly handling different types of tasks. We call these solutions compound AI systems.
In this blog post, we'll walk through a compelling Databricks demo that showcases the capabilities of the Mosaic AI platform. This demo highlights Unity Catalog functions as tools, out-of-the-box AI functions, and the Mosaic AI Agent and Evaluation framework, demonstrating how these components work together to create a powerful compound AI system. This sounds like a lot, and it is, but we’ll show how the Databricks platform makes it easy to do all of this by following these 3 steps:
Tools extend large language models beyond producing text. By empowering the model to take actions like retrieving documents or making API calls to file a support ticket, tools enable AI to run your business processes, which is the key to differentiating from the competition. There are many libraries and approaches for tool calling, such as DSPy, LangChain, and CrewAI, and although each of those runs on Databricks, we'll focus on using Unity Catalog functions from MLflow due to their ease of use and integration with the broader Databricks platform.
Unity Catalog functions allow users to define custom logic in SQL or Python. These functions follow the Unity Catalog three-tier namespace, so they can be securely governed and shared like all other Unity Catalog objects (e.g. tables, volumes, and models). Only users who have the EXECUTE permission on your function and USAGE permissions on the schema and catalog will be able to run your function. For this demo, we have created a catalog and schema where all of our tools will live.
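As a sketch of what that governance looks like, granting a group access to a tool might resemble the following (the `support-team` group and function name are hypothetical, and exact privilege names can vary by Databricks release):

```sql
-- Hypothetical grants letting a group discover and run a tool function
GRANT USE CATALOG ON CATALOG josh_melton TO `support-team`;
GRANT USE SCHEMA ON SCHEMA josh_melton.tools_demo TO `support-team`;
GRANT EXECUTE ON FUNCTION josh_melton.tools_demo.similarity_search TO `support-team`;
```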
We’ll start by introducing three tools written in SQL that leverage AI capabilities: AI forecasting, similarity search, and text classification. AI_FORECAST makes predictions about a time series in a simple way. For example, given a table of IoT data coming in from engines, with hourly timestamp, output, and engine ID columns, we can run the following to forecast future energy output:
SELECT *
FROM AI_FORECAST(
  TABLE(josh_melton.iot_schema.engine_output),
  horizon => date_add(current_date(), 1),
  time_col => 'hourly_timestamp',
  value_col => 'avg_output',
  group_col => 'engine_id'
)
But what if we wanted to make this available to other users, or even AI agents, more seamlessly? We can simply wrap that in a CREATE FUNCTION, being sure to use descriptive naming and comments so agents will understand when to use the function:
DROP FUNCTION IF EXISTS josh_melton.tools_demo.forecast_energy_output;
CREATE FUNCTION josh_melton.tools_demo.forecast_energy_output()
RETURNS TABLE (
  hourly_timestamp TIMESTAMP,
  engine_id STRING,
  avg_energy_forecast FLOAT,
  avg_energy_upper FLOAT,
  avg_energy_lower FLOAT
) COMMENT "Returns the forecasted energy output for each engine through tomorrow" RETURN
SELECT *
FROM AI_FORECAST(
  TABLE(josh_melton.iot_schema.engine_output),
  horizon => date_add(current_date(), 1),
  time_col => 'hourly_timestamp',
  value_col => 'avg_output',
  group_col => 'engine_id'
)
Now we can empower our agents to forecast into the future, and other SQL users can run a two-liner to do the same:
SELECT *
FROM josh_melton.tools_demo.forecast_energy_output()
Similarly, we’ll create another function using ai_classify() to classify the urgency of support tickets. The ai_classify() function accepts two arguments: the content to be labeled and an array of candidate labels.
CREATE OR REPLACE FUNCTION josh_melton.tools_demo.ai_classify_urgency(ticket_text_input STRING)
RETURNS STRING
RETURN SELECT ai_classify(
  ai_classify_urgency.ticket_text_input,
  ARRAY('very urgent', 'somewhat urgent', 'not urgent')
) AS urgency
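To sanity check the tool, we can call it directly on a sample ticket description (the ticket text below is made up for illustration):

```sql
SELECT josh_melton.tools_demo.ai_classify_urgency(
  'The engine is overheating and the production line has stopped entirely'
) AS urgency
```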
For our last SQL tool, we’ll create a similarity search function. Databricks provides a SQL vector_search() function, which allows you to query a vector search index using SQL. It takes in 3 arguments: the index to query, the query string, and the number of results to return.
For this demo, we’ve already created a vector search index and loaded it with customer service tickets. If you haven’t created one yet, follow these steps.
Here’s how we can implement our similarity search function, which takes the query string as input:
CREATE FUNCTION josh_melton.tools_demo.similarity_search (
  query STRING COMMENT "The string to search for similar tickets to" DEFAULT "Turbochargers malfunctioning due to overheating"
) RETURNS TABLE (
  ticket_number STRING,
  issue_description STRING
) COMMENT "Returns the support ticket issues related to the query" RETURN
SELECT ticket_number, issue_description
FROM VECTOR_SEARCH(
  index => "josh_melton.tools_demo.customer_service_tickets_index",
  query => similarity_search.query,
  num_results => 3
)
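As with the forecasting tool, other SQL users (or agents) can now run the search with a short query of their own; the search string below is just an example:

```sql
SELECT *
FROM josh_melton.tools_demo.similarity_search("Turbocharger overheating at high RPM")
```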
Python UDFs in Databricks Unity Catalog extend our data capabilities far beyond traditional SQL, allowing us to leverage the power and flexibility of Python. Next, we’ll define two more tools, tell_dad_joke() and update_ticket(), using Python.
The first Python tool, tell_dad_joke(), uses the Python requests library to make HTTP calls to an external API. In this example, we call the icanhazdadjoke.com API, which returns a random joke. This lighthearted example demonstrates the ability to interact with external services, whether for fetching real-time information or integrating with other relevant microservices.
CREATE FUNCTION josh_melton.tools_demo.tell_dad_joke()
RETURNS STRING COMMENT "Returns a dad joke"
LANGUAGE PYTHON
AS $$
import requests
url = "https://icanhazdadjoke.com/"
headers = {"Accept": "application/json"}
response = requests.get(url, headers=headers)
if response.status_code == 200:
    joke_data = response.json()
    return joke_data['joke']
else:
    return f"Failed to retrieve joke. Status code: {response.status_code}"
$$;
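Once registered, the Python tool can be called from SQL just like the SQL tools:

```sql
SELECT josh_melton.tools_demo.tell_dad_joke() AS joke
```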
We can use any standard Python library included in Databricks in our Python functions to perform a variety of tasks, such as data tokenization, data masking, and complex calculations. In our second example, update_ticket(), we’ll simply hard code a response with some string interpolation for demo purposes. When the function is called with a ticket number and urgency, it returns a string indicating the updates have been made.
CREATE FUNCTION josh_melton.tools_demo.update_ticket(
  ticket_number STRING COMMENT "The ticket number to update",
  urgency STRING COMMENT "The urgency or severity of the ticket"
)
RETURNS STRING COMMENT "Updates the ticket urgency status with the given ticket_number, or returns a status"
LANGUAGE PYTHON
AS $$
return f"Ticket {ticket_number} updated with urgency '{urgency}'"
$$;
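Because the UDF body is plain Python, we can sanity check the logic locally before registering it. The function below is just a local sketch mirroring the UDF body, not a call to the Unity Catalog function itself, and the ticket number is made up:

```python
def update_ticket(ticket_number: str, urgency: str) -> str:
    # Mirrors the body of the Unity Catalog UDF for local testing;
    # a real implementation would call your ticketing system's API here
    return f"Ticket {ticket_number} updated with urgency '{urgency}'"

print(update_ticket("TICKET-123", "very urgent"))
# prints: Ticket TICKET-123 updated with urgency 'very urgent'
```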
Now that we have all of our tools defined, it’s time to test them! The Databricks AI Playground offers an easy-to-use UI for interacting with LLMs and calling our tools. Here’s what we’ll do to configure our compound AI system:
Now we can test out the agent, augmented with the tools we’ve added. When we try asking about issues related to a certain topic, our model can retrieve context from our similarity search function and use the most up to date information to answer the question:
Similarly, we can ask it to classify the urgency of the second ticket, and even update the status of the ticket if it’s highly urgent:
While the generic Llama 70B model isn’t the most powerful LLM in isolation, it doesn’t take PhD-level intelligence to know that a user asking “here’s a tool for ticket classification, now classify this ticket” wants you to use the tool to do classification, or that a highly urgent ticket should probably be updated as soon as possible. By augmenting smaller, faster, and cheaper models with tools, we can provide more value than slow, expensive, generically intelligent models.
Now that we’ve developed and tested our agent, we’ll want to deploy it for others to use. We can do that by clicking the “export” button at the top of the playground screen, adding in the catalog, schema, and model names to the driver notebook (find the “TODO” comment), and clicking “run all” (more details here). If you’ve included tools that require access to assets like the vector search index, you’ll also need to grant the agent permission to use those (the functions themselves are added automatically).
from mlflow.models.resources import (
    DatabricksFunction,
    DatabricksServingEndpoint,
    DatabricksVectorSearchIndex,
)

resources = [
    DatabricksServingEndpoint(endpoint_name=config.get("llm_endpoint")),
    DatabricksVectorSearchIndex(index_name="josh_melton.tools_demo.customer_service_tickets_index"),
]
Running the notebook will log, register, and deploy the agent as an MLflow model, along with creating a front end interface to try it out. The deployment process will take around 20 minutes.
The Databricks AI Playground provides an intuitive interface for testing and refining our compound AI system. This environment allows for rapid prototyping and iteration, enabling developers to quickly validate their AI solutions before deployment.
By leveraging Unity Catalog functions as tools, we've shown how to extend the capabilities of large language models beyond mere text generation. These tools, whether implemented in SQL or Python, allow AI to perform specific tasks such as forecasting, classification, similarity search, and even interacting with external APIs.
Throughout this demo, the Mosaic AI platform demonstrates its value in several key areas: