- 573 Views
- 2 replies
- 1 kudos
Resolved! Hosted gpt-oss endpoint system prompts contain "You have no access to tools"
We are using the foundation model endpoints (provisioned and pay-per-token) for the gpt-oss models for a research project. We have been experiencing consistent tool call failures: gpt-oss-20b was failing ~1 out of every 12, while gpt-oss-120b was fai...
- 1 kudos
Hi @satusky, I'm not an expert in this area, but after some internal research, I wouldn't treat a leaked system-prompt snippet as a product issue related to tool use. I can’t comment on the exact internal prompt template for a specific invocation, bu...
- 2110 Views
- 5 replies
- 1 kudos
Resolved! Problems with unstructured_data_pipeline
Hi everyone, I'm currently working with the unstructured data pipeline in Databricks, using the official notebook provided by Databricks without any modifications. Strangely, despite being an out-of-the-box resource, the notebook fails during executio...
- 1 kudos
Hi @Mariano-Vertiz - Which access mode are you using for your cluster - dedicated or standard? I think it is failing as a standard cluster does not allow the low-level operation it is trying to perform in cell 42. Is that where it's failing? I tried ...
- 925 Views
- 1 replies
- 0 kudos
How to utilize clustered gpu for large hf models
Hi, I am using a GPU cluster (driver: 1 GPU, worker: 3 GPUs) and caching model data in Unity Catalog, but while loading model checkpoint shards it always uses driver memory and fails due to insufficient memory. How to use complete cluster GPU while loadi...
- 0 kudos
1. Are you using a model-parallel library, such as FSDP or DeepSpeed? Otherwise, every GPU will load the entire model weights. 2. If yes in 1, Unity Catalog Volumes are exposed on every node at /Volumes/<catalog>/<schema>/<volume>/..., so w...
- 9921 Views
- 10 replies
- 7 kudos
Resolved! Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes
Hi Community, I’m currently working on a Retrieval-Augmented Generation (RAG) use case in Databricks. I’ve successfully implemented and served a model that uses a single Vector Search index, and everything works as expected. However, when I try to serv...
- 16526 Views
- 4 replies
- 0 kudos
Resolved! LangGraph MemorySaver checkpointer usage with MLflow
Hi everyone. I am working on a graph that utilizes the MemorySaver class to incorporate short-term memory. This will enable me to maintain a multi-turn conversation with the user by storing the chat history. I am using the MLflow "models from code" fea...
- 0 kudos
Hi @moemedina. No, I didn't. I'm considering using the ChatModel/ChatAgent class to wrap the graph so that I can move on. However, the MLflow documentation still refers to ChatModel, whereas ChatAgent is the latest recommendation: MLflow ChatModel Doc...