Understanding compute requirements for deploying DeepSeek-R1-Distilled-Llama models on Databricks

kbmv
Contributor

Hi, I came across the blog "Deploying DeepSeek-R1-Distilled-Llama Models on Databricks" at https://www.databricks.com/blog/deepseek-r1-databricks

I am new to using custom models that are not available among the built-in foundation models.

According to the blog, I need to download a DeepSeek distilled model from Hugging Face to my volume, register it in MLflow, and serve it with provisioned throughput. Can someone help me with the following questions?
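For context, this is roughly how I understand the download-and-register step. It is only a sketch of my reading of the blog; the catalog, schema, volume path, and registered model name below are placeholders I made up, not values from the blog:

```python
from huggingface_hub import snapshot_download
import mlflow
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download the distilled weights from Hugging Face into a Unity Catalog volume
# (volume path is a placeholder)
local_path = snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    local_dir="/Volumes/main/default/models/deepseek-r1-distill-llama-8b",
)

# Load the model and tokenizer from the downloaded snapshot
tokenizer = AutoTokenizer.from_pretrained(local_path)
model = AutoModelForCausalLM.from_pretrained(local_path)

# Register the model in Unity Catalog via MLflow's transformers flavor,
# tagging it as a chat model so it can be served with provisioned throughput
mlflow.set_registry_uri("databricks-uc")
with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model={"model": model, "tokenizer": tokenizer},
        artifact_path="model",
        task="llm/v1/chat",
        registered_model_name="main.default.deepseek_r1_distill_llama_8b",
    )
```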

  1. If I want to deploy the 70B model, the blog recommends g6e.4xlarge compute, which has 128 GB of CPU memory and 48 GB of GPU memory. To clarify, do I need this specific compute only for the MLflow registration of the model?

    Additionally, the blog states:
    "You don’t need GPUs per se to deploy the model within the notebook, as long as the compute has sufficient memory capacity."

    Does this refer to serving the model only? Or can I complete both the MLflow registration and the serving deployment using a compute instance with 128 GB of CPU memory and no GPU?

  2. For provisioned throughput serving, when I select my registered model for serving, what will the pricing per hour of usage be? Will deepseek-r1-distilled-llama-70b be priced the same as Llama 3.3 70B, and deepseek-r1-distilled-llama-8b the same as Llama 3.1 8B, as listed at https://www.databricks.com/product/pricing/foundation-model-serving, or will the pricing be different?
  3. For custom RAG chains or agent models, I have seen an option to select the compute type (CPU, GPU Small, etc.). Will that also be the case for my distilled model, or does it work as per point 2? If so, what would be the recommended compute for the 70B and 8B variants? Attaching a screenshot, and see the endpoint-creation sketch after this list for what I assume the provisioned throughput path looks like.
    [Screenshot: kbmv_0-1738850768911.png]
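This is how I assume the provisioned throughput endpoint would be created with the Databricks SDK. The endpoint name, entity version, and throughput numbers are placeholders I picked for illustration (the valid throughput increments come from the model's optimization info), which is partly why I'm asking about pricing and compute type:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import EndpointCoreConfigInput, ServedEntityInput

w = WorkspaceClient()

# Create a provisioned-throughput endpoint for the registered model.
# entity_version and the throughput band below are placeholders, not
# values from the blog or the pricing page.
w.serving_endpoints.create(
    name="deepseek-r1-distill-llama-8b",
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name="main.default.deepseek_r1_distill_llama_8b",
                entity_version="1",
                min_provisioned_throughput=0,
                max_provisioned_throughput=9500,
            )
        ]
    ),
)
```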


    Thanks

    Posted on the wrong board; I wasn't able to move or delete it, so I recreated the same question here.

