02-06-2025 05:11 AM
Hi, I have read the blog "Deploying Deepseek-R1-Distilled-Llama Models on Databricks" at https://www.databricks.com/blog/deepseek-r1-databricks
I am new to using custom models that are not available as part of the foundation models.
According to the blog, I need to download a DeepSeek distilled model from Hugging Face to my volume, register it with MLflow, and serve it using provisioned throughput. Can someone help me with the following questions?
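For context, here is my rough understanding of the download-and-register step from the blog. This is just a sketch based on my reading; the volume path and catalog/schema/model names are placeholders I made up:

```python
# Sketch of the download + MLflow registration steps (paths and names are placeholders).
from huggingface_hub import snapshot_download
import mlflow
from transformers import AutoModelForCausalLM, AutoTokenizer

volume_path = "/Volumes/main/default/deepseek"  # placeholder UC volume path

# 1) Download the distilled model weights from Hugging Face into the volume
snapshot_download(
    repo_id="deepseek-ai/DeepSeek-R1-Distill-Llama-70B",
    local_dir=volume_path,
)

# 2) Load the model and tokenizer, then log and register them in Unity Catalog
tokenizer = AutoTokenizer.from_pretrained(volume_path)
model = AutoModelForCausalLM.from_pretrained(volume_path)

mlflow.set_registry_uri("databricks-uc")
with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model={"model": model, "tokenizer": tokenizer},
        artifact_path="model",
        task="llm/v1/chat",  # chat task, as required for provisioned throughput serving
        registered_model_name="main.default.deepseek_r1_distill_llama_70b",  # placeholder
    )
```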
If I want to download the 70B model, the recommended compute is g6e.4xlarge, which has 128 GB of CPU memory and 48 GB of GPU memory. To clarify, do I need this specific compute only for the MLflow registration of the model?
Additionally, the blog states:
"You donโt need GPUs per se to deploy the model within the notebook, as long as the compute has sufficient memory capacity."
Does this refer only to serving the model? Or can I complete both the MLflow registration and the serving deployment using a compute instance with 128 GB of CPU memory and no GPU?
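For the serving step, I assume it would look something like this sketch using the Databricks SDK. The endpoint name, model name, and throughput values are placeholders, not taken from the blog:

```python
# Sketch of creating a provisioned-throughput serving endpoint (names/values are placeholders).
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import (
    EndpointCoreConfigInput,
    ServedEntityInput,
)

w = WorkspaceClient()

w.serving_endpoints.create(
    name="deepseek-r1-distill-llama-70b",  # placeholder endpoint name
    config=EndpointCoreConfigInput(
        served_entities=[
            ServedEntityInput(
                entity_name="main.default.deepseek_r1_distill_llama_70b",  # registered model
                entity_version="1",
                min_provisioned_throughput=0,
                max_provisioned_throughput=9500,  # tokens/sec; pick a band the endpoint UI offers
            )
        ]
    ),
)
```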
Thanks
02-07-2025 12:10 AM
Hi @kbmv ,
Based on my experience deploying Deepseek-R1-Distilled-Llama on Databricks, here are my answers to your questions:
02-07-2025 01:23 AM
Thanks @Isi for the detailed explanation. Things are clear now.