Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Understanding compute requirements for deploying DeepSeek-R1-Distilled-Llama models on Databricks

kbmv
Contributor

Hi, I came across the blog "Deploying DeepSeek-R1-Distilled-Llama Models on Databricks" at https://www.databricks.com/blog/deepseek-r1-databricks

I am new to using custom models that are not available as part of the foundation models.

According to the blog, I need to download a DeepSeek distilled model from Hugging Face to my volume, register it in MLflow, and serve it with provisioned throughput. Can someone help me with the following questions?
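For context, here is a minimal sketch of how I understand the blog's workflow (download to a volume, then log and register with MLflow). The catalog/schema names, volume path, and registered model name below are placeholders I made up, not values from the blog:

```python
def register_distilled_model(
    repo_id: str = "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    volume_dir: str = "/Volumes/main/default/models/deepseek-r1-distill-llama-8b",
    registered_name: str = "main.default.deepseek_r1_distill_llama_8b",
):
    """Hypothetical sketch: download distilled weights to a UC volume,
    then log and register them with MLflow for serving."""
    # Heavy dependencies are imported lazily so the function can be
    # defined outside a Databricks cluster.
    from huggingface_hub import snapshot_download
    import mlflow
    import transformers

    # 1. Download the weights into the volume. This needs enough disk
    #    and CPU memory, not a GPU.
    snapshot_download(repo_id=repo_id, local_dir=volume_dir)

    # 2. Load tokenizer + model from the downloaded files and log them.
    #    task="llm/v1/chat" marks the model as chat-completions
    #    compatible, which provisioned throughput serving expects.
    tokenizer = transformers.AutoTokenizer.from_pretrained(volume_dir)
    model = transformers.AutoModelForCausalLM.from_pretrained(volume_dir)
    mlflow.set_registry_uri("databricks-uc")
    with mlflow.start_run():
        mlflow.transformers.log_model(
            transformers_model={"model": model, "tokenizer": tokenizer},
            artifact_path="model",
            task="llm/v1/chat",
            registered_model_name=registered_name,
        )
```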

  1. If I want to download the 70B model, the recommended compute is g6e.4xlarge, which has 128 GB of CPU memory and 48 GB of GPU memory. To clarify, do I need this specific compute only for the MLflow registration of the model?

    Additionally, the blog states:
    "You don't need GPUs per se to deploy the model within the notebook, as long as the compute has sufficient memory capacity."

    Does this refer to serving the model only? Or can I complete both the MLflow registration and the deployment for serving using a compute instance with 128 GB of CPU memory and no GPU?

  2. For provisioned throughput, when I select my registered model for serving, what will my per-hour pricing be? Will deepseek-r1-distilled-llama-70b be priced the same as Llama 3.3 70B, and deepseek-r1-distilled-llama-8b the same as Llama 3.1 8B, as listed at the following link, or will the pricing be different? https://www.databricks.com/product/pricing/foundation-model-serving
  3. For custom RAG chains or agent models, I have seen the option to select a compute type such as CPU, GPU Small, etc. Will that be the case for my distilled model, or does point 2 apply? If the former, what would be the recommended compute for the 70B and 8B variants? Attaching a screenshot.
    kbmv_0-1738850768911.png

     

    Thanks

    (Posted on the wrong board earlier; I wasn't able to move or delete it, so I recreated the same question here.)
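To make question 1 concrete, here is my own back-of-the-envelope memory arithmetic (not from the blog): weights stored in fp16/bf16 take roughly 2 bytes per parameter, which suggests why registration is mainly a memory/disk question rather than a GPU question.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the model weights, in GB,
    assuming fp16/bf16 (2 bytes per parameter)."""
    return num_params * bytes_per_param / 1e9

# ~140 GB for the 70B model: far more than the 48 GB GPU on a
# g6e.4xlarge, and most of its 128 GB of CPU memory.
gb_70b = weight_memory_gb(70e9)

# ~16 GB for the 8B model: fits comfortably in CPU memory.
gb_8b = weight_memory_gb(8e9)

print(gb_70b, gb_8b)
```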

