Hey y'all!
So I'm experimenting with Databricks' DatabricksVectorSearch class in Python to serve as a tool that can be used by an agent. When I run it in a notebook, I get the following message:
"[NOTICE] Using a notebook authentication token. Recommended for development only. For improved performance, please use Service Principal based authentication. To disable this message, pass disable_notice=True."
I do get the correct output in the notebook. But when I serve it and test it via the Playground, the agent tells me that it has encountered an authentication issue. I think it has to do with this notice, but I'm not really sure, and I don't know how to fix it...
Here's my code:
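# -----------------------------
# Imports (assumed; they were omitted from the snippet as posted)
from typing import Any

from databricks_langchain import ChatDatabricks, DatabricksVectorSearch
from langchain_core.tools import Tool
from langgraph.prebuilt import create_react_agent

# -----------------------------
# Initialize the Vector Search tool
vs_tool = DatabricksVectorSearch(
    index_name="dbw_genai_sbx_databricks.unstructured.pdf_docs_chunked_index",
    columns=["chunk_id", "doc_uri", "content_chunked"],
)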
# -----------------------------
# Function to retrieve information
def retrieve_infs_for_databricks(query: str) -> list[dict[str, Any]]:
    docs = vs_tool.similarity_search(query=query)
    res = []
    for doc in docs:
        res.append({
            "content": doc.page_content,
            "source": doc.metadata["doc_uri"],
            "chunk_id": doc.metadata["chunk_id"],
        })
    print(f"Res:\n{res}")
    return res
# -----------------------------
# LangChain tool definition
tool = Tool(
    name="databricks_docs_retriever",
    func=retrieve_infs_for_databricks,
    description="""
    Searches the Databricks documentation for information about Databricks products.
    It accepts a string query containing the main keywords to search for.
    It returns a list of dictionaries with the following keys:
    - content: contains the retrieved information
    - source: contains the URL of the document where the information was retrieved
    - chunk_id: contains the ID of the chunk where the information was retrieved
    When it returns a source, you should also cite the source in your final result.
    """
)
print("----------------------------------------------------------------------")
# -----------------------------
# Initialize LangChain LLM with tools
llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")
agent = create_react_agent(
    llm,
    tools=[tool],
)
# -----------------------------
# Example usage
agent.invoke({
    "messages": [
        {"role": "user", "content": "Based on the Databricks documentation, What is the best way to develop an AI agent? And give me the sources"}
    ]
})
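
For reference, the notice quoted above mentions two things: Service Principal based authentication and disable_notice=True. Below is a minimal sketch of how those could be wired in, not a confirmed fix for the serving error. It assumes that DatabricksVectorSearch accepts a client_args dict that is forwarded to the underlying Vector Search client, and that the client takes service_principal_client_id, service_principal_client_secret, workspace_url and disable_notice. The environment variable names and both helper functions are hypothetical, chosen only for illustration.

```python
import os
from typing import Any, Mapping


def build_client_args(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Collect service principal settings for the Vector Search client.

    The variable names (DATABRICKS_CLIENT_ID, DATABRICKS_CLIENT_SECRET,
    DATABRICKS_HOST) are hypothetical; use whatever your endpoint exposes.
    """
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")
    if not client_id or not client_secret:
        # No credentials available: return nothing so the client keeps
        # its default behaviour (the notebook token, inside a notebook).
        return {}

    args: dict[str, Any] = {
        "service_principal_client_id": client_id,
        "service_principal_client_secret": client_secret,
        "disable_notice": True,  # the flag named in the notice
    }
    host = env.get("DATABRICKS_HOST")
    if host:
        # Assumption: outside a notebook the workspace URL must be explicit.
        args["workspace_url"] = host
    return args


def make_vs_tool():
    """Build the vector search tool with the collected client arguments."""
    # Imported lazily so build_client_args can be used without the package.
    from databricks_langchain import DatabricksVectorSearch

    return DatabricksVectorSearch(
        index_name="dbw_genai_sbx_databricks.unstructured.pdf_docs_chunked_index",
        columns=["chunk_id", "doc_uri", "content_chunked"],
        # Assumption: client_args is passed through to the Vector Search client.
        client_args=build_client_args(),
    )
```

With this, vs_tool = make_vs_tool() would replace the direct DatabricksVectorSearch(...) call. When the credentials are missing, build_client_args returns an empty dict, so the notebook behaviour stays the same as before.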