Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Forum Posts

satusky
by New Contributor II
  • 573 Views
  • 2 replies
  • 1 kudos

Resolved! Hosted gpt-oss endpoint system prompts contain "You have no access to tools"

We are using the foundation model endpoints (provisioned and pay-per-token) for the gpt-oss models for a research project. We have been experiencing consistent tool call failures: gpt-oss-20b was failing ~1 out of every 12, while gpt-oss-120b was fai...

Latest Reply
Ashwin_DSA
Databricks Employee
  • 1 kudos

Hi @satusky, I'm not an expert in this area, but after some internal research, I wouldn't treat a leaked system-prompt snippet as a product issue related to tool use. I can’t comment on the exact internal prompt template for a specific invocation, bu...

1 More Replies
Mariano-Vertiz
by New Contributor II
  • 2110 Views
  • 5 replies
  • 1 kudos

Resolved! Problems with unstructured_data_pipeline

Hi everyone, I'm currently working with the unstructured data pipeline in Databricks, using the official notebook provided by Databricks without any modifications. Strangely, despite being an out-of-the-box resource, the notebook fails during executio...

Latest Reply
dkushari
Databricks Employee
  • 1 kudos

Hi @Mariano-Vertiz - Which access mode are you using for your cluster - dedicated or standard? I think it is failing as a standard cluster does not allow the low-level operation it is trying to perform in cell 42. Is that where it's failing? I tried ...

4 More Replies
dk_g
by New Contributor
  • 925 Views
  • 1 reply
  • 0 kudos

How to utilize clustered GPUs for large HF models

Hi, I am using a GPU cluster (driver: 1 GPU, worker: 3 GPUs) and caching model data in Unity Catalog, but loading the model checkpoint shards always uses driver memory and fails due to insufficient memory. How can I use the complete cluster's GPUs while loadi...

Latest Reply
lin-yuan
Databricks Employee
  • 0 kudos

1. Are you using a model-parallel library, such as FSDP or DeepSpeed? Otherwise, every GPU will load the entire model weights. 2. If yes to 1, Unity Catalog Volumes are exposed on every node at /Volumes/<catalog>/<schema>/<volume>/..., so w...
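To make the first point in the reply concrete, here is a rough back-of-the-envelope sketch of per-GPU weight memory with and without sharding. The 7-billion-parameter size, 16-bit precision, and the even 4-way split (1 driver GPU plus 3 worker GPUs) are illustrative assumptions, not figures from the thread, and the helper `per_gpu_weight_gib` is hypothetical. It counts weights only; activations, optimizer state, and temporary buffers used while loading checkpoint shards are ignored.

```python
def per_gpu_weight_gib(num_params: float, bytes_per_param: int, num_shards: int = 1) -> float:
    """Approximate GiB of model weights held by each GPU.

    num_shards=1 models the unsharded case, where every GPU holds a full copy.
    num_shards>1 models an idealized even split across that many GPUs.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return num_params * bytes_per_param / num_shards / 2**30


# Assumed example: 7e9 parameters stored in 16-bit precision (2 bytes each).
replicated = per_gpu_weight_gib(7e9, 2)               # every GPU loads everything
sharded = per_gpu_weight_gib(7e9, 2, num_shards=4)    # idealized 4-way sharding

print(f"full copy per GPU:   {replicated:.2f} GiB")
print(f"4-way shard per GPU: {sharded:.2f} GiB")
```

Real sharding libraries rarely split this evenly, so treat the sharded figure as a lower bound on what each GPU needs.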

Karthik_Karanm
by New Contributor III
  • 9921 Views
  • 10 replies
  • 7 kudos

Resolved! Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes

Hi Community, I’m currently working on a Retrieval-Augmented Generation (RAG) use case in Databricks. I’ve successfully implemented and served a model that uses a single Vector Search index, and everything works as expected. However, when I try to serv...

Latest Reply
lingareddy_Alva
Esteemed Contributor
  • 7 kudos

Thank you

9 More Replies
sebascardonal
by Databricks Partner
  • 16526 Views
  • 4 replies
  • 0 kudos

Resolved! LangGraph MemorySaver checkpointer usage with MLflow

Hi everyone. I am working on a graph that uses the MemorySaver class to incorporate short-term memory. This will let me maintain a multi-turn conversation with the user by storing the chat history. I am using the MLflow "models from code" fea...

Latest Reply
sebascardonal
Databricks Partner
  • 0 kudos

Hi @moemedina. No, I didn't. I'm considering using the ChatModel/ChatAgent class to wrap the graph so I can move on. However, the MLflow documentation still refers to ChatModel, whereas ChatAgent is the latest recommendation: MLflow ChatModel Doc...

3 More Replies