- 477 Views
- 1 replies
- 0 kudos
How to utilize clustered GPU for large HF models
Hi, I am using a GPU cluster (driver: 1 GPU, workers: 3 GPUs) and caching model data in Unity Catalog, but loading the model checkpoint shards always uses driver memory and fails due to insufficient memory. How can I use the complete cluster GPU while loadi...
1. Are you using a model-parallel library such as FSDP or DeepSpeed? If not, every GPU will load the entire set of model weights.
2. If the answer to 1 is yes, Unity Catalog Volumes are exposed on every node at /Volumes/<catalog>/<schema>/<volume>/..., so w...
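To make the first point concrete, here is a rough back-of-the-envelope sketch of weight memory per GPU with and without sharding. It is only an illustration: the 7B parameter count, the fp16 assumption (2 bytes per parameter), and the helper names are hypothetical, and it ignores activations, optimizer state, and framework overhead. It also assumes a model-parallel library splits the weights evenly, which real setups only approximate.

```python
def weights_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in GiB (fp16 = 2 bytes/param)."""
    return num_params * bytes_per_param / 2**30


def per_gpu_weights_gib(num_params: int, num_gpus: int, sharded: bool,
                        bytes_per_param: int = 2) -> float:
    """Weight memory each GPU must hold.

    sharded=False: every GPU loads the full weights (no model parallelism).
    sharded=True:  weights are split evenly across GPUs (idealized assumption).
    """
    total = weights_gib(num_params, bytes_per_param)
    return total / num_gpus if sharded else total


def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a /Volumes/<catalog>/<schema>/<volume>/... path string."""
    return "/".join(["/Volumes", catalog, schema, volume, *parts])


if __name__ == "__main__":
    params = 7_000_000_000   # hypothetical 7B-parameter model
    gpus = 4                 # 1 driver GPU + 3 worker GPUs, as in the question

    full = per_gpu_weights_gib(params, gpus, sharded=False)
    split = per_gpu_weights_gib(params, gpus, sharded=True)
    print(f"replicated: {full:.2f} GiB per GPU")
    print(f"sharded:    {split:.2f} GiB per GPU")

    # Placeholder names; substitute your own catalog, schema, and volume.
    print(volume_path("my_catalog", "my_schema", "my_volume", "model_cache"))
```

Without sharding, the per-GPU figure does not shrink as GPUs are added, which matches the symptom of a single device running out of memory while the others sit idle.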
- 5284 Views
- 10 replies
- 7 kudos
Resolved! Insufficient Permission Error When Serving RAG Model with Multiple Vector Search Indexes
Hi Community, I’m currently working on a Retrieval-Augmented Generation (RAG) use case in Databricks. I’ve successfully implemented and served a model that uses a single Vector Search index, and everything works as expected. However, when I try to serv...
- 6086 Views
- 4 replies
- 0 kudos
Resolved! LangGraph MemorySaver checkpointer usage with MLflow
Hi everyone. I am working on a graph that utilizes the MemorySaver class to incorporate short-term memory. This will enable me to maintain a multi-turn conversation with the user by storing the chat history. I am using the MLflow "models from code" fea...
Hi @moemedina. No, I didn't. I'm considering using the ChatModel/ChatAgent class to wrap the graph so I can move on. However, the MLflow documentation still refers to ChatModel, whereas ChatAgent is the latest recommendation: MLflow ChatModel Doc...