The rapid advancements in Artificial Intelligence (AI) are reshaping the landscape of automation, and at the heart of this transformation is the rise of Generative AI Agents.
These agents, powered by large-scale models, are not only automating routine tasks but also introducing new ways of reasoning, decision-making, and interaction that were previously unimaginable.
AI agents can plan, memorise, reason, and act autonomously to achieve pre-defined goals over multiple interactions.
In this blog, we will dive into how these cutting-edge systems are reshaping the financial domain and what it takes to build reliable, domain-specific agents that deliver real value in today’s fast-paced world.
Whether you’re a developer, data scientist, or AI architect, this guide will walk you through building production-grade agents using the Databricks Mosaic AI Agent Framework—from foundational concepts to real-world deployment.
AI agents have the power to change how we work, learn, and interact with the world. However, building these agents is not easy, especially when making them reliable and domain specific. Because of this, many companies focus on creating specialised agents designed for specific tasks.
These agents often rely on enterprise business data, external APIs, a mix of custom code, rules, and careful prompt design to work effectively.
In the competitive world of global finance, portfolio managers face immense pressure to make data-driven decisions in real time, yet traditional methods of stock analysis are slow, manual, and prone to missed opportunities.
Modern financial institutions are turning to AI to revolutionize investment strategies, combining speed, accuracy, and personalization; portfolio managers need tools that pair speed with intelligence.
An AI-based “Investment Assistant Agent” tailored for portfolio managers automates stock analysis, processes real-time market data, extracts insights from historical data, and provides intelligent recommendations.
By leveraging AI to streamline the first level of decision-making, portfolio managers can focus on more strategic initiatives, reduce analysis time, and improve investment outcomes, driving both efficiency and profitability.
An AI agent is an autonomous software system designed to interact with its environment, perceive data, and take actions to achieve specific goals.
These agents simulate intelligent behaviour by continuously learning from their experiences and adjusting their actions based on new information.
An agent tool is essentially a function. “Tool calling” is the process in which the model predicts which tool to use and with what arguments. Tool calling does not imply execution of the function; the model simply generates parameters that can be used to call the function. The application code then chooses how to handle the prediction, typically by calling the indicated function.
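For illustration, a tool call from an OpenAI-style chat model might look like the following; the tool name, arguments, and call id are hypothetical:

```python
import json

# Illustrative shape of a tool call an OpenAI-style chat model might return.
# The model does NOT run anything; it only proposes a function name and
# arguments as structured JSON.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_001",                       # hypothetical call id
        "type": "function",
        "function": {
            "name": "get_stock_quote",          # hypothetical tool name
            "arguments": '{"ticker": "TSLA"}',  # JSON string of arguments
        },
    }],
}

# Application code decides whether and how to execute the proposed call:
args = json.loads(assistant_message["tool_calls"][0]["function"]["arguments"])
# result = get_stock_quote(**args)  # executed by our code, not by the model
```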
Agent frameworks are essential for building, managing, and scaling AI agents in complex applications. They provide enhanced control over workflows, support for multi-agent collaboration, scalability, debugging tools, and seamless integration with external tools.
AI agents are categorised based on their complexity, decision-making processes, and how they interact with their environment. The main types are simple reflex agents, model-based reflex agents, goal-based agents, utility-based agents, and learning agents.
Image source: https://langchain-ai.github.io/langgraph/concepts/multi_agent/#multi-agent-architectures
This framework is designed for rapid experimentation and deployment while maintaining control over data sources.
Databricks Mosaic AI introduces the Mosaic AI Agent Framework, integrated with MLflow, for building high-quality Generative AI applications.
It comprises a set of tools on Databricks designed to help developers build, deploy, and evaluate production-quality agents like Retrieval Augmented Generation (RAG) applications, Text-to-SQL agent, data analyst agent, customer support agent, research agent, business operation assistant, advisory agent and many more.
It focuses on robust evaluation of agent performance through human feedback loops, cost/latency trade-offs, and quality metrics.
It is compatible with third-party frameworks like LangChain/LangGraph and LlamaIndex, and leverages Databricks’ managed Unity Catalog, the Agent Evaluation Framework, MLflow, Model Serving, and other platform benefits.
Here we are going to build a Single Agent using Databricks Agent Framework.
Note:
Source code link is provided at the end.
Financial insights and stock quote prices are extracted from the Yahoo Finance API.
Get access to the Yahoo Finance API and create your own API key (the free key has limitations; please check the dashboard, link below).
Access the Yahoo Finance dashboard for the API specifications.
I have uploaded the CSV files of historical stock prices (source: Kaggle dataset) to Volumes in Databricks Unity Catalog, then created a Delta table from these CSV files.
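A minimal sketch of this step is shown below, assuming the files sit in a Unity Catalog Volume; the catalog, schema, volume, and table names are placeholders to adjust for your workspace:

```python
# Read the uploaded CSV files from a Unity Catalog Volume and persist
# them as a Delta table (paths and names are illustrative).
csv_path = "/Volumes/main/finance/stock_data/historical_prices/"

df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv(csv_path))

(df.write
   .format("delta")
   .mode("overwrite")
   .saveAsTable("main.finance.historical_stock_prices"))
```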
I then created synthetic data for customer investment preferences using the Python Faker library and uploaded it to another Delta table.
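A hedged sketch of that step, with illustrative column names and value ranges:

```python
from faker import Faker
import random

fake = Faker()

# Generate synthetic customer investment preferences; the schema here is
# an illustrative assumption, not the project's exact one.
rows = [
    {
        "customer_id": i,
        "customer_name": fake.name(),
        "risk_profile": random.choice(["conservative", "balanced", "aggressive"]),
        "preferred_sector": random.choice(["technology", "energy", "healthcare"]),
        "investment_horizon_years": random.randint(1, 30),
    }
    for i in range(1, 2001)
]

(spark.createDataFrame(rows)
     .write.format("delta").mode("overwrite")
     .saveAsTable("main.finance.customer_investment_preferences"))
```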
Refer to GitHub for the full data preparation scripts.
Note: please create your own data as per your business requirements.
Function Calling: Function calling allows LLMs to generate structured responses more reliably. This capability allows us to use an LLM as an agent that can call functions by outputting JSON objects and mapping arguments. Function calling is explained in my previous blog.
LLMs are not deterministic and are trained on general knowledge from the internet. Generic LLMs cannot access real-time data or organisational data.
For this solution, we will create four Unity Catalog functions that execute SQL and Python code.
Run the code in notebook cells, using the %sql notebook magic to create a SQL-based Unity Catalog function and %python for the Python function.
Python function:
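A hedged sketch of one such function, created from a %python cell with the unitycatalog-ai client. The Yahoo Finance endpoint, catalog, and schema names are assumptions, the API key should come from a Databricks secret rather than a literal, and outbound network access is assumed to be available to the function:

```python
from unitycatalog.ai.core.databricks import DatabricksFunctionClient

client = DatabricksFunctionClient()

def get_stock_quote(ticker: str) -> str:
    """Fetch the latest quote for a ticker from the Yahoo Finance API.

    Args:
        ticker: Stock ticker symbol, e.g. "TSLA".

    Returns:
        The quote payload as a JSON string.
    """
    import json
    import urllib.request
    # Endpoint per the Yahoo Finance API dashboard; replace the key with
    # a value fetched from a Databricks secret.
    url = f"https://yfapi.net/v6/finance/quote?symbols={ticker}"
    req = urllib.request.Request(url, headers={"x-api-key": "<YOUR_API_KEY>"})
    with urllib.request.urlopen(req) as resp:
        return json.dumps(json.load(resp))

client.create_python_function(
    func=get_stock_quote, catalog="main", schema="finance", replace=True
)
```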
SQL function:
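A sketch of one SQL function, run from a %sql cell; the table and column names follow the illustrative schema above:

```sql
-- Expose customer preferences as a governed, callable tool.
CREATE OR REPLACE FUNCTION main.finance.get_customer_preferences(cust_id INT)
RETURNS TABLE (customer_id INT, risk_profile STRING, preferred_sector STRING)
COMMENT 'Returns investment preferences for a given customer id.'
RETURN
  SELECT customer_id, risk_profile, preferred_sector
  FROM main.finance.customer_investment_preferences
  WHERE customer_id = cust_id;
```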
After creating the Unity Catalog function, use the AI Playground to give the tool to an LLM and test the agent. The AI Playground provides a sandbox to prototype tool-calling agents.
Once you’re happy with the AI agent, you can export it to develop it further in Python or deploy it as a Model Serving endpoint as is.
The basic agent package is an auto-generated notebook (driver) created by the Databricks AI Playground export. Please refer to the driver notebook in the GitHub link for the explanation and steps.
Code-based MLflow logging: The chain’s code is captured as a Python file. The Python environment is captured as a list of packages. When the chain is deployed, the Python environment is restored, and the chain’s code is executed to load the chain into memory so it can be invoked when the endpoint is called.
The agent uses the LangChain/LangGraph framework. For other agent authoring patterns, please refer to Author AI agents in code.
In order to log a model from code, you can leverage the mlflow.models.set_model() API. This API allows us to define a model by specifying an instance of the model class directly within the file where the model is defined.
When logging such a model, a file path is specified (instead of an object) that points to the Python file containing both the model class definition and the usage of the set_model API applied on an instance of the custom model.
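Sketched below is the shape of this pattern; the file name, agent builder, and pip requirements are assumptions:

```python
# --- agent.py (the model-definition file) ---
# import mlflow
# AGENT = build_agent()            # hypothetical builder for the LangGraph agent
# mlflow.models.set_model(AGENT)   # register the instance for models-from-code

# --- driver notebook: log the agent by file path, not by object ---
import mlflow

with mlflow.start_run():
    logged_agent_info = mlflow.pyfunc.log_model(
        artifact_path="agent",
        python_model="agent.py",  # path to the file that calls set_model()
        pip_requirements=["mlflow", "langgraph", "databricks-langchain"],
    )
```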
We are using ChatAgent to build the agent.
Databricks recommends using MLflow's ChatAgent interface to author production-grade agents. This chat schema specification is designed for agent scenarios and is similar to, but not strictly compatible with, the OpenAI ChatCompletion schema. ChatAgent also adds functionality for multi-turn, tool-calling agents.
Authoring your agent using ChatAgent provides the following benefits:
- Create and log agents using any library and MLflow.
- Parameterise agents to experiment and iterate on agent development quickly.
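A minimal sketch of the interface, following the LangGraph wrapper pattern from the Databricks examples (the compiled graph passed to the constructor is assumed to be built elsewhere):

```python
from typing import Any, Optional

from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import ChatAgentMessage, ChatAgentResponse, ChatContext


class LangGraphChatAgent(ChatAgent):
    """Wraps a compiled LangGraph graph in MLflow's ChatAgent interface."""

    def __init__(self, agent):
        self.agent = agent  # a compiled LangGraph state graph

    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -> ChatAgentResponse:
        request = {"messages": self._convert_messages_to_dict(messages)}
        out: list[ChatAgentMessage] = []
        # Stream node updates from the graph and collect the messages each
        # node emits (assumes nodes return message dicts, as in the
        # Databricks LangGraph examples).
        for event in self.agent.stream(request, stream_mode="updates"):
            for node_data in event.values():
                out.extend(ChatAgentMessage(**msg) for msg in node_data["messages"])
        return ChatAgentResponse(messages=out)
```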
Configure the LLM endpoint and system prompt, and define the agent tools. Call mlflow.langchain.autolog() to view the trace for each step the agent takes.
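A hedged configuration sketch follows; the endpoint name, catalog, schema, and function names are assumptions for this walkthrough:

```python
import mlflow
from databricks_langchain import ChatDatabricks, UCFunctionToolkit

mlflow.langchain.autolog()  # trace every step the agent takes

# Foundation-model serving endpoint (name is an assumption)
llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")

# Expose the Unity Catalog functions created above as agent tools
uc_tool_names = [
    "main.finance.get_stock_quote",
    "main.finance.get_customer_preferences",
]
tools = UCFunctionToolkit(function_names=uc_tool_names).tools
```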
The system prompt is all about crafting precise, role-based instructions that enable the agent to autonomously handle complex, multi-step tasks with accuracy, transparency, and safety.
Here the prompt clearly assigns the agent the role of an investment assistant, which ensures all actions and responses are grounded in financial best practices and relevant context.
The prompt lays out a sequential process for the agent to follow: identifying tickers, gathering customer preferences, analyzing stock data, and generating recommendations. This structure reduces ambiguity, minimizes errors, and ensures consistent, high-quality outputs.
By instructing the agent to only use explicitly provided information (e.g., customer IDs) and never to assume or fabricate data, the prompt enforces strict privacy and security standards.
The agent is guided to clearly communicate any missing information and base recommendations only on available data, enhancing user trust.
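An illustrative system prompt reflecting these principles (the wording in the actual project may differ):

```python
system_prompt = """You are an investment assistant for portfolio managers.
Follow these steps in order:
1. Identify the stock ticker(s) mentioned in the request.
2. Look up the customer's investment preferences, using ONLY a customer id
   that was explicitly provided. Never assume or fabricate a customer id.
3. Analyse current and historical stock data using the available tools.
4. Generate a recommendation grounded only in the retrieved data.
If required information is missing, say so clearly instead of guessing."""
```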
MLflow Tracing lets you log, analyze, and compare traces across your agent code to debug and understand how your agent responds to requests. Since this notebook calls mlflow.langchain.autolog(), we can view the trace for each step the agent takes.
Test the agent:
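A quick smoke test, following the call pattern of the auto-generated driver notebooks (the question and customer id are examples):

```python
# Invoke the agent directly before logging and deploying it.
AGENT.predict(
    {"messages": [
        {"role": "user",
         "content": "Should I invest in Tesla stocks for customer id 1540?"}
    ]}
)
```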
MLflow Tracing also allows us to compare traces across different versions of a generative AI application, debug the generative AI Python code, and keep track of inputs, tool calls, and responses.
Next, evaluate the agent using the mlflow.evaluate() API with the Databricks agent evaluator, register the agent to Unity Catalog, and deploy it to a Model Serving endpoint. We can edit the requests or expected responses in the evaluation dataset and re-run the evaluation as we iterate on the agent, leveraging MLflow to track the computed quality metrics.
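A hedged sketch of the evaluation step; the evaluation set is a tiny illustrative example, and logged_agent_info comes from the logging step above:

```python
import mlflow
import pandas as pd

# A minimal evaluation dataset; real evaluation sets should cover many
# representative requests and expected responses.
eval_df = pd.DataFrame([
    {
        "request": "Should I invest in Tesla stocks for customer id 1540?",
        "expected_response": "A recommendation grounded in the customer's "
                             "risk profile and current TSLA data.",
    }
])

with mlflow.start_run():
    eval_results = mlflow.evaluate(
        data=eval_df,
        model=logged_agent_info.model_uri,  # from the logging step
        model_type="databricks-agent",      # use the Databricks agent evaluator
    )
```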
Run mlflow.evaluate: a link to “View evaluation results” will be displayed upon successful execution of the mlflow.evaluate() function.
Deploy agents to production with native support for token streaming and request/response logging, plus a built-in review app to get user feedback for your agent.
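A sketch of the registration and deployment steps; the Unity Catalog model name is an assumption:

```python
import mlflow
from databricks import agents

mlflow.set_registry_uri("databricks-uc")  # register into Unity Catalog
UC_MODEL_NAME = "main.finance.investment_assistant_agent"

registered = mlflow.register_model(logged_agent_info.model_uri, UC_MODEL_NAME)

# Creates the Model Serving endpoint and the associated Review App
agents.deploy(UC_MODEL_NAME, registered.version)
```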
Successful deployment will show a link to the Model Serving endpoint, where you can view its status (it generally takes 5–10 minutes to start), and the Review App link.
Let’s test the Review App: “Should I invest in Tesla stocks for customer id 1540?”
Get feedback about the quality of an agentic application using Databricks Review App.
The Databricks Review App offers a robust platform for integrating domain experts and SMEs into the agent development lifecycle through a human-in-the-loop approach. In this controlled environment, stakeholders such as business users and subject matter experts can interact directly with the agent—engaging in conversations, asking questions, and providing critical feedback. Every interaction is recorded in an inference table, enabling detailed performance analysis. This continuous feedback loop helps ensure the quality, safety, and reliability of the agent’s responses while fostering greater trust and adoption.
The agent provides a decent answer with this setup.
The code is available in the GitHub repository, under the folder 2025-04-investment-assistant-with-databricks-agent-framework.
This blog has walked through a practical approach to building a domain-specific AI agent for investment recommendations.
While the prototype agent showcases promising capabilities, such as personalised stock recommendations and real-time insights, it is essential to emphasise the importance of a “human-in-the-loop” approach. This ensures that financial risks are mitigated and the system evolves based on expert feedback.
Moving forward, incorporating additional data sources, refining evaluation metrics, and aligning with organisational strategies and multi-agent architecture can further enhance the agent’s accuracy and reliability.
The Databricks Mosaic AI Agent Framework demonstrates its potential as a robust platform for building intelligent, autonomous agents capable of handling complex tasks like financial analysis. By integrating tools such as Unity Catalog, Agent tools/Function, MLflow, Model Serving, Review App and the AI Playground, it enables rapid experimentation, seamless deployment, and comprehensive evaluation of AI agents.