Regarding your first question about reducing the scale-down time of Databricks serverless model serving: currently, the system scales down to zero only after 30 minutes of inactivity. That window keeps instances warm so the endpoint can absorb sudden increases in traffic, while scaling to zero afterward keeps costs down for non-24/7 traffic or development environments. Unfortunately, there is no direct way to shorten this window, as it is part of the auto-scaling algorithm Databricks uses.
For your second question about deploying multiple models under a single endpoint: yes, Databricks supports this. You can serve multiple models from a single CPU serving endpoint with Databricks Model Serving, and an endpoint can serve any Python MLflow model registered in the Model Registry. When you create the endpoint, you can set a traffic split between the served models. For example, one model (call it "current") can receive 90% of the endpoint traffic, while another (call it "challenger") receives the remaining 10%. You can also update the traffic split between served models later as needed (see the sketch after the link below).
https://docs.databricks.com/en/machine-learning/model-serving/serve-multiple-models-to-serving-endpo...
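As a rough illustration, here is a minimal sketch of creating such an endpoint through the Serving Endpoints REST API using Python's requests library. The workspace URL, token, endpoint name, model names, and versions are placeholders you would replace with your own, and the exact payload fields should be double-checked against the documentation linked above:

```python
import requests

# Placeholders -- replace with your workspace URL and a valid access token.
DATABRICKS_HOST = "https://<your-workspace>.cloud.databricks.com"
TOKEN = "<personal-access-token>"

# One endpoint serving two versions of the same registered model,
# with a 90/10 traffic split and scale-to-zero enabled on both.
payload = {
    "name": "my-endpoint",  # hypothetical endpoint name
    "config": {
        "served_models": [
            {
                "name": "current",
                "model_name": "my_registered_model",  # hypothetical registered model
                "model_version": "1",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            },
            {
                "name": "challenger",
                "model_name": "my_registered_model",
                "model_version": "2",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            },
        ],
        "traffic_config": {
            "routes": [
                {"served_model_name": "current", "traffic_percentage": 90},
                {"served_model_name": "challenger", "traffic_percentage": 10},
            ]
        },
    },
}

resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/serving-endpoints",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=payload,
)
resp.raise_for_status()
print(resp.json())
```

To change the split later (for example, promoting "challenger" to a larger share), you would send an updated config with new traffic_percentage values to the endpoint's config update API rather than recreating the endpoint.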