<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes in Generative AI</title>
    <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119496#M892</link>
    <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Karthik_Karanm_0-1747408786969.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/16915i53F44347B8998DF7/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Karthik_Karanm_0-1747408786969.png" alt="Karthik_Karanm_0-1747408786969.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Databricks recommends using Unity Catalog instead of the legacy Table Access Control (TAC) feature.&lt;/STRONG&gt;&amp;nbsp;Enabling Unity Catalog requires configuring extra permissions, such as &lt;STRONG&gt;Cluster Access Control (ACLs)&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;We want to confirm whether this is the recommended approach moving forward, or if there is an alternative method to achieve the same access control functionality.&lt;/P&gt;&lt;P&gt;Thank you for your time.&lt;/P&gt;</description>
    <pubDate>Fri, 16 May 2025 15:28:53 GMT</pubDate>
    <dc:creator>Karthik_Karanm</dc:creator>
    <dc:date>2025-05-16T15:28:53Z</dc:date>
    <item>
      <title>Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/118924#M881</link>
      <description>&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;Hi Community,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;I’m currently working on a Retrieval-Augmented Generation (RAG) use case in Databricks. I’ve successfully implemented and served a model that uses a &lt;STRONG&gt;single Vector Search index&lt;/STRONG&gt;, and everything works as expected.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;However, when I try to serve a model that utilizes &lt;STRONG&gt;multiple Vector Search indexes&lt;/STRONG&gt;, I encounter the following error during model serving:&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="2" color="#FF6600"&gt;mlflow.exceptions.MlflowException: Failed to run user code from /model/model.py. Error: Response content b'{"error_code":"PERMISSION DENIED","message":"Insufficient permissions for UC entity cd.schema.table_vs","details":[{"@type":"type.googleapis.com/google.rpc.RequestInfo","request_id":"b5c11ebd-f66d-4574-9bdc-b89bf6d06339","serving_data":""}]}', status_code 403. Review the stack trace for more information.&lt;/FONT&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;STRONG&gt;"Insufficient permission to the vector search tables"&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;All the involved vector search indexes are accessible during the indexing and model creation phase. 
The issue only appears when attempting to &lt;STRONG&gt;serve the model&lt;/STRONG&gt;.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;STRONG&gt;Key Observations:&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;FONT face="verdana,geneva" size="2"&gt;Serving a model with a single vector search index works fine.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT face="verdana,geneva" size="2"&gt;Serving a model with multiple vector search indexes leads to a permission error.&lt;/FONT&gt;&lt;/LI&gt;&lt;LI&gt;&lt;FONT face="verdana,geneva" size="2"&gt;The permissions on the individual vector search tables seem to be correctly set, and accessible in other contexts.&lt;/FONT&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;&lt;STRONG&gt;Has anyone faced a similar issue or can suggest what specific permissions might be missing when using multiple indexes in a RAG setup?&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="verdana,geneva" size="2"&gt;Thanks in advance!&lt;BR /&gt;&lt;BR /&gt;#Databricks #VectorSearch #RAG #MLflow #ModelServing #DatabricksPermissions #LakehouseAI #GenAI #DatabricksCommunity #MLOps&lt;BR /&gt;&lt;BR /&gt;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 12 May 2025 16:00:57 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/118924#M881</guid>
      <dc:creator>Karthik_Karanm</dc:creator>
      <dc:date>2025-05-12T16:00:57Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/118978#M882</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/160788"&gt;@Karthik_Karanm&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This is a known issue pattern when using multiple Unity Catalog (UC) vector search indexes in Databricks Model Serving, especially with MLflow model serving endpoints in a RAG architecture.&lt;/P&gt;&lt;P&gt;Your model serving environment (i.e., the model inference cluster running the MLflow model) does not inherit the permissions of your interactive environment (such as a notebook). As a result:&lt;BR /&gt;- Unity Catalog returns 403 PERMISSION_DENIED errors.&lt;BR /&gt;- Even though you can query and use those vector search tables during development, the model serving endpoint runs in a separate, tightly scoped environment and likely lacks direct access to the underlying Unity Catalog assets (like schema.table_vs).&lt;/P&gt;&lt;P&gt;To resolve this, you'll need to explicitly grant the model serving principal access to the Unity Catalog entities (vector search tables).&lt;/P&gt;</description>
      <pubDate>Tue, 13 May 2025 01:12:09 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/118978#M882</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-05-13T01:12:09Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119064#M883</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;Thank you for your detailed response. It helped clarify the separation between the interactive environment and the model serving environment in Databricks.&lt;/P&gt;&lt;P&gt;However, I’m still encountering the same issue even though &lt;STRONG&gt;I am the owner of all the involved entities:&lt;/STRONG&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The &lt;STRONG&gt;Unity Catalog tables that back the vector search indexes&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;The &lt;STRONG&gt;Vector Search indexes themselves&lt;/STRONG&gt;&lt;/LI&gt;&lt;LI&gt;The &lt;STRONG&gt;MLflow model and serving endpoints&lt;/STRONG&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Tue, 13 May 2025 16:02:06 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119064#M883</guid>
      <dc:creator>Karthik_Karanm</dc:creator>
      <dc:date>2025-05-13T16:02:06Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119068#M884</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/160788"&gt;@Karthik_Karanm&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The Model Serving environment runs in an isolated, production-grade context (a different compute plane than your interactive workspace).&lt;BR /&gt;Even though you own the objects, the serving runtime executes as a system service principal or service identity that:&lt;BR /&gt;-- May not inherit your personal workspace permissions&lt;BR /&gt;-- Needs explicit permissions granted to access Unity Catalog tables and Vector Search indexes&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;1. Grant Permissions to the Model Serving Identity&lt;/STRONG&gt;&lt;BR /&gt;You need to manually grant SELECT privileges to the serving identity on:&lt;BR /&gt;-- The tables backing the Vector Search indexes&lt;BR /&gt;-- Optionally, the schemas and catalogs themselves if using fine-grained access control&lt;/P&gt;&lt;P&gt;First, identify the serving identity (it might be something like databricks-model-serving).&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;2. Enable Table ACLs in Unity Catalog (if not already)&lt;/STRONG&gt;&lt;BR /&gt;Ensure that Table Access Control (Table ACLs) is enabled in the workspace and catalog. You can check this under:&lt;BR /&gt;Admin Console → Data → Unity Catalog → Permissions → Table Access Control = On&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;3. Re-deploy or Rebuild Model After Permissions Update&lt;/STRONG&gt;&lt;BR /&gt;Sometimes permission changes don't take immediate effect for a running model. You may need to:&lt;BR /&gt;-- Rebuild and log the MLflow model (if UC tags changed)&lt;BR /&gt;-- Delete and redeploy the endpoint&lt;BR /&gt;-- Or at minimum, restart the endpoint to clear cached permissions&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;4. Use catalog.table Syntax Explicitly in Model Code&lt;/STRONG&gt;&lt;BR /&gt;Sometimes the serving context is sensitive to fully qualified names, so reference tables and indexes by their full names in the model code.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Extra Debugging Tip&lt;/STRONG&gt;&lt;BR /&gt;To simulate the serving context, create a service principal and attach it to a job cluster or notebook using impersonation mode.&lt;BR /&gt;If that principal fails with the same error, you've validated the access mismatch.&lt;/P&gt;</description>
      <pubDate>Tue, 13 May 2025 16:25:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119068#M884</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-05-13T16:25:45Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119363#M886</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Thank you for your time.&lt;/P&gt;&lt;P&gt;Could you please clarify the following:&lt;/P&gt;&lt;OL&gt;&lt;LI&gt;The permission error occurred during model registration when the model used multiple Vector Search indexes.&lt;/LI&gt;&lt;LI&gt;However, when the same model used a single Vector Search index, registration completed successfully and everything worked as expected.&lt;/LI&gt;&lt;/OL&gt;&lt;P&gt;Could you please help us understand why this issue occurs only in the first scenario, with multiple indexes?&lt;/P&gt;</description>
      <pubDate>Thu, 15 May 2025 16:16:34 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119363#M886</guid>
      <dc:creator>Karthik_Karanm</dc:creator>
      <dc:date>2025-05-15T16:16:34Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119372#M887</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/160788"&gt;@Karthik_Karanm&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;When registering a model that references multiple Unity Catalog tables (backing the vector indexes), Databricks attempts to access and resolve all table metadata during the packaging and validation steps of registration.&lt;/P&gt;&lt;P&gt;Here’s what changes with multiple indexes:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;1. Expanded Scope of Access&lt;/STRONG&gt;&lt;BR /&gt;-- Each Vector Search index is backed by a Delta table in Unity Catalog.&lt;BR /&gt;-- Using multiple indexes causes the model registration process to attempt to read metadata for every referenced UC entity.&lt;BR /&gt;-- If permissions are missing on any of those tables, even temporarily, the registration will fail.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;2. Stricter Enforcement in Model Context&lt;/STRONG&gt;&lt;BR /&gt;-- During interactive development or indexing, you're likely operating under a full-access identity (e.g., your own user in a personal workspace or notebook).&lt;BR /&gt;-- During model registration, Databricks may execute in a different context (e.g., under the job's service principal or a model registry service identity), which may not have equivalent permissions.&lt;/P&gt;</description>
      <pubDate>Thu, 15 May 2025 16:56:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119372#M887</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-05-15T16:56:29Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119496#M892</link>
      <description>&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Karthik_Karanm_0-1747408786969.png" style="width: 400px;"&gt;&lt;img src="https://community.databricks.com/t5/image/serverpage/image-id/16915i53F44347B8998DF7/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Karthik_Karanm_0-1747408786969.png" alt="Karthik_Karanm_0-1747408786969.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;&amp;nbsp;&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Databricks recommends using Unity Catalog instead of the legacy Table Access Control (TAC) feature.&lt;/STRONG&gt;&amp;nbsp;Enabling Unity Catalog requires configuring extra permissions, such as &lt;STRONG&gt;Cluster Access Control (ACLs)&lt;/STRONG&gt;.&lt;/P&gt;&lt;P&gt;We want to confirm whether this is the recommended approach moving forward, or if there is an alternative method to achieve the same access control functionality.&lt;/P&gt;&lt;P&gt;Thank you for your time.&lt;/P&gt;</description>
      <pubDate>Fri, 16 May 2025 15:28:53 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119496#M892</guid>
      <dc:creator>Karthik_Karanm</dc:creator>
      <dc:date>2025-05-16T15:28:53Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119503#M893</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/24053"&gt;@lingareddy_Alva&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;I forgot to mention that I am both the metastore admin and a workspace admin, and the serving endpoint runs as my user.&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Fri, 16 May 2025 17:43:03 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119503#M893</guid>
      <dc:creator>Karthik_Karanm</dc:creator>
      <dc:date>2025-05-16T17:43:03Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119636#M897</link>
      <description>&lt;P&gt;The error was misleading.&lt;BR /&gt;It is related to the library we used for agent authoring.&lt;BR /&gt;The issue was resolved when we changed the library from langchain_core.runnables to langgraph.graph with some additional code changes.&lt;/P&gt;&lt;P&gt;Here are the reference links:&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/generative-ai/agent-framework/log-agent#-specify-resources-for-automatic-authentication-passthrough-system-authentication" target="_blank"&gt;https://docs.databricks.com/aws/en/generative-ai/agent-framework/log-agent#-specify-resources-for-automatic-authentication-passthrough-system-authentication&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/generative-ai/agent-framework/author-agent#chatagent" target="_blank"&gt;https://docs.databricks.com/aws/en/generative-ai/agent-framework/author-agent#chatagent&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Kudos to&amp;nbsp;Jackson Turek from Databricks.&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Ramana&lt;/P&gt;</description>
      <pubDate>Mon, 19 May 2025 15:47:43 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119636#M897</guid>
      <dc:creator>Ramana</dc:creator>
      <dc:date>2025-05-19T15:47:43Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119637#M898</link>
      <description>&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Mon, 19 May 2025 16:00:18 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119637#M898</guid>
      <dc:creator>lingareddy_Alva</dc:creator>
      <dc:date>2025-05-19T16:00:18Z</dc:date>
    </item>
    <item>
      <title>Re: Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes</title>
      <link>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119880#M902</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/40873"&gt;@Ramana&lt;/a&gt;&amp;nbsp;- can you please be more specific about the changes that were required? I am receiving the same error. I was originally using the langchain_core.runnables library and reworked the code to not rely on it, but I still get the same error when deploying. The agent works fine when I run it in my notebook. My original code (listed below) stems from the Multi-Agent Genie system example at the link below. I originally had additional nodes, including Genie, but removed them to try to get the deployment to work for now.&lt;/P&gt;&lt;P&gt;&lt;A href="https://docs.databricks.com/aws/en/generative-ai/agent-framework/multi-agent-genie?scid=701Vp000004h4c4IAA&amp;amp;utm_medium=programmatic&amp;amp;utm_source=google&amp;amp;utm_campaign=22507112156&amp;amp;utm_adgroup=&amp;amp;utm_content=summit&amp;amp;utm_offer=dataaisummit&amp;amp;utm_ad=&amp;amp;utm_term=&amp;amp;gad_source=1&amp;amp;gad_campaignid=22507113074&amp;amp;gbraid=0AAAAABYBeAjJBK6Yps_hSSp9sIzsxssUG&amp;amp;gclid=EAIaIQobChMI9-Lhwfi0jQMVXQCtBh3fuDyzEAAYASAAEgLm_PD_BwE" target="_blank"&gt;https://docs.databricks.com/aws/en/generative-ai/agent-framework/multi-agent-genie?scid=701Vp000004h4c4IAA&amp;amp;utm_medium=programmatic&amp;amp;utm_source=google&amp;amp;utm_campaign=22507112156&amp;amp;utm_adgroup=&amp;amp;utm_content=summit&amp;amp;utm_offer=dataaisummit&amp;amp;utm_ad=&amp;amp;utm_term=&amp;amp;gad_source=1&amp;amp;gad_campaignid=22507113074&amp;amp;gbraid=0AAAAABYBeAjJBK6Yps_hSSp9sIzsxssUG&amp;amp;gclid=EAIaIQobChMI9-Lhwfi0jQMVXQCtBh3fuDyzEAAYASAAEgLm_PD_BwE&lt;/A&gt;&lt;/P&gt;&lt;PRE&gt;
import functools
import os
from typing import Any, Generator, Literal, Optional

import mlflow
from databricks.sdk import WorkspaceClient
from databricks_langchain import ChatDatabricks, VectorSearchRetrieverTool

from databricks_langchain.uc_ai import (
    DatabricksFunctionClient,
    UCFunctionToolkit,
    set_uc_function_client
)

from databricks_langchain.genie import GenieAgent
from langchain_core.runnables import RunnableLambda
from langgraph.graph import END, StateGraph
from langgraph.graph.state import CompiledStateGraph
from langgraph.prebuilt import create_react_agent
from mlflow.langchain.chat_agent_langgraph import ChatAgentState
from mlflow.pyfunc import ChatAgent
from mlflow.types.agent import (
    ChatAgentChunk,
    ChatAgentMessage,
    ChatAgentResponse,
    ChatContext,
)
from pydantic import BaseModel

from langchain_openai import OpenAIEmbeddings

mlflow.langchain.autolog()

############################################
############################################

LLM_ENDPOINT_NAME = "XXXX"

llm = ChatDatabricks(endpoint=LLM_ENDPOINT_NAME)

assert LLM_ENDPOINT_NAME is not None

client = DatabricksFunctionClient()
set_uc_function_client(client)

tools = []

# # TODO if desired, add additional tools and update the description of this agent
uc_tool_names = ["system.ai.python_exec"]#,"dev_catalog.tmp.liquidity_calculations"]
uc_toolkit = UCFunctionToolkit(function_names=uc_tool_names)
tools.extend(uc_toolkit.tools)

# Plain string; a trailing comma inside the parentheses would make this a tuple
tools_agent_description = (
    "This agent can execute python code and perform data analysis."
)

embedding_model = OpenAIEmbeddings(model="text-embedding-3-large")

index_name = "dev_catalog.default.earnings_index"
endpoint_name = "earnings_index"

vs_tools = []
vs_agent_description = ("""
The Earnings Vector Search agent has access to a knowledge base of earnings report data related to the company ABC
The knowledge base includes information from 10Ks and Investor Presentations.
Users will want to be able to access numerical data from these reports. This includes servicing related metrics such as delinquency
""")

# Adjacent string literals are concatenated, so each piece ends with a separator
vs_tool_description = (
    "Provide users information from company earnings reports. "
    "Returns numerical results related to earnings, company performance, delinquency rates, origination volume. "
    "Avoid any explanation or commentary unless you are unsure."
)

vs_tool = [VectorSearchRetrieverTool(
    index_name=index_name,  # Index name in the format 'catalog.schema.index'
    num_results=4,  # Max number of documents to return
    query_type="ANN",  # Query type ("ANN" or "HYBRID").
    tool_name="earnings_reports_vector_search",  # Name the LLM uses to refer to the tool
    tool_description=vs_tool_description,  # Used by the LLM to understand the purpose of the tool
    text_column="text",  # Specify text column for embeddings. Required for direct-access index or delta-sync index with self-managed embeddings.
    embedding=embedding_model  # The embedding model. Required for direct-access index or delta-sync index with self-managed embeddings.
)]

vs_tools.extend(vs_tool)

#tools.extend(earnings_vs_tool)

tools_agent = create_react_agent(llm, tools=tools)
vs_agent = create_react_agent(llm, tools=vs_tools)

worker_descriptions = {
    "Earnings Vector Search": vs_agent_description,
}

formatted_descriptions = "\n".join(
    f"- {name}: {desc}" for name, desc in worker_descriptions.items()
)

system_prompt = f"""You are the supervisor in a multi-agent system. Your job is to route the user's question to the appropriate specialist agent(s).

You may choose from the following workers, or select FINISH if the question has already been fully answered based on data that has been retrieved from Genie queries.

- If multiple agents are needed, route them one at a time and collect their answers before selecting FINISH.
- Maintain awareness of which agents have already responded by reviewing the message history.
- IMPORTANT: DO NOT CALL AN AGENT MORE THAN ONCE

Available agents:
{formatted_descriptions}
"""

options = ["FINISH"] + list(worker_descriptions.keys())


def supervisor_agent(state):
    class nextNode(BaseModel):
        next_node: Literal[tuple(options)]

    preprocessor = RunnableLambda(
        lambda state: [{"role": "system", "content": system_prompt}] + state["messages"]
    )
    supervisor_chain = preprocessor | llm.with_structured_output(nextNode)
    return supervisor_chain.invoke(state)


def agent_node(state, agent, name):
    result = agent.invoke(state)
    return {
        "messages": [
            {
                "role": "assistant",
                "content": result["messages"][-1].content,
                "name": name,
            }
        ]
    }


def final_answer(state):
    system_prompt = f'''
    Using only the content in the messages, respond to the user's question using the answer given by the other agents.
    - You should be trying to create a report framework that has an "Introduction"
    - It should then have a section of a high level summary, title this as "Overview"
    - Then have a final section that says "Figures" and includes just sub-bullets with any numerical value for each requested metric
    '''

    preprocessor = RunnableLambda(
        lambda state: [{"role": "system", "content": system_prompt}] + state["messages"]
    )
    final_answer_chain = preprocessor | llm
    return {"messages": [final_answer_chain.invoke(state)]}


class AgentState(ChatAgentState):
    next_node: str


vs_node = functools.partial(agent_node, agent=vs_agent, name="Earnings Vector Search")

workflow = StateGraph(AgentState)
workflow.add_node("Earnings Vector Search", vs_node)
workflow.add_node("supervisor", supervisor_agent)
workflow.add_node("final_answer", final_answer)

workflow.set_entry_point("supervisor")
# We want our workers to ALWAYS "report back" to the supervisor when done
for worker in worker_descriptions.keys():
    workflow.add_edge(worker, "supervisor")

# Let the supervisor decide which next node to go
workflow.add_conditional_edges(
    "supervisor",
    lambda x: x["next_node"],
    {**{k: k for k in worker_descriptions.keys()}, "FINISH": "final_answer"},
)
workflow.add_edge("final_answer", END)
multi_agent = workflow.compile()


class LangGraphChatAgent(ChatAgent):
    def __init__(self, agent: CompiledStateGraph):
        self.agent = agent

    def predict(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -&amp;gt; ChatAgentResponse:
        request = {
            "messages": [m.model_dump_compat(exclude_none=True) for m in messages]
        }

        messages = []
        for event in self.agent.stream(request, stream_mode="updates"):
            for node_data in event.values():
                messages.extend(
                    ChatAgentMessage(**msg) for msg in node_data.get("messages", [])
                )
        return ChatAgentResponse(messages=messages)

    def predict_stream(
        self,
        messages: list[ChatAgentMessage],
        context: Optional[ChatContext] = None,
        custom_inputs: Optional[dict[str, Any]] = None,
    ) -&amp;gt; Generator[ChatAgentChunk, None, None]:
        request = {
            "messages": [m.model_dump_compat(exclude_none=True) for m in messages]
        }
        for event in self.agent.stream(request, stream_mode="updates"):
            for node_data in event.values():
                yield from (
                    ChatAgentChunk(**{"delta": msg})
                    for msg in node_data.get("messages", [])
                )


# Create the agent object, and specify it as the agent object to use when
# loading the agent back for inference via mlflow.models.set_model()
AGENT = LangGraphChatAgent(multi_agent)
mlflow.models.set_model(AGENT)
&lt;/PRE&gt;</description>
      <pubDate>Wed, 21 May 2025 16:10:40 GMT</pubDate>
      <guid>https://community.databricks.com/t5/generative-ai/insufficient-permission-error-when-serving-rag-model-with/m-p/119880#M902</guid>
      <dc:creator>bdroesch</dc:creator>
      <dc:date>2025-05-21T16:10:40Z</dc:date>
    </item>
  </channel>
</rss>

