How to reduce scale-to-zero time in MLflow Serving

sanjay
Valued Contributor II

Hi,

I am deploying MLflow models using Databricks serverless serving, but it seems the servers scale down to zero only after 30 minutes of inactivity. Is there any way to reduce this time?

Also, is it possible to deploy multiple models under a single endpoint? I want to run multiple models on one endpoint to reduce cost, similar to the multi-model deployment functionality that AWS SageMaker provides.

Appreciate any help.

Regards,
Sanjay

1 REPLY

Walter_C
Databricks Employee

Regarding your first question about reducing the scale-down time of Databricks serverless serving: the system is designed to scale down to zero only after 30 minutes of inactivity. This keeps instances warm to absorb sudden spikes in traffic while still cutting costs for non-24/7 or development workloads. Unfortunately, there is no direct way to shorten this window; it is part of the auto-scaling algorithm used by Databricks.
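
For what it's worth, the endpoint configuration exposes scale-to-zero only as an on/off switch per served model; there is no idle-timeout field to tune. A minimal sketch of one served-model entry, assuming the public serving-endpoints REST API schema (the catalog, schema, model name, and version below are placeholders):

# One served-model entry in a serving-endpoint config (Python dict for the
# REST payload). scale_to_zero_enabled is a boolean toggle; the ~30-minute
# idle window itself is not a configurable field.
served_entity = {
    "entity_name": "my_catalog.my_schema.my_model",  # placeholder model
    "entity_version": "1",
    "workload_size": "Small",
    "scale_to_zero_enabled": True,  # on/off only; no custom idle timeout
}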


For your second question about deploying multiple models under a single endpoint: yes, Databricks supports this. You can serve multiple models from a single CPU serving endpoint with Databricks Model Serving; an endpoint can serve any registered Python MLflow model from the Model Registry. You create one endpoint with multiple served models and set a traffic split between them. For example, one model (call it "current") can receive 90% of the endpoint traffic while another ("challenger") receives the remaining 10%, and you can update the split between served models at any time.

https://docs.databricks.com/en/machine-learning/model-serving/serve-multiple-models-to-serving-endpo...
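
To make the traffic-split setup concrete, here is a minimal sketch, assuming the public serving-endpoints REST API schema. The endpoint name, model names, and versions are all placeholders; check the doc link above for the current field names.

# Minimal sketch: create one serving endpoint that serves two models with a
# 90/10 traffic split via POST /api/2.0/serving-endpoints.
import os
import requests

host = os.environ["DATABRICKS_HOST"]    # e.g. https://<workspace-url>
token = os.environ["DATABRICKS_TOKEN"]  # personal access token

payload = {
    "name": "multi-model-endpoint",  # placeholder endpoint name
    "config": {
        "served_entities": [
            {
                "name": "current",  # label referenced by the traffic routes
                "entity_name": "my_catalog.my_schema.current_model",  # placeholder
                "entity_version": "2",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            },
            {
                "name": "challenger",
                "entity_name": "my_catalog.my_schema.challenger_model",  # placeholder
                "entity_version": "1",
                "workload_size": "Small",
                "scale_to_zero_enabled": True,
            },
        ],
        # 90% of requests go to "current", 10% to "challenger".
        "traffic_config": {
            "routes": [
                {"served_model_name": "current", "traffic_percentage": 90},
                {"served_model_name": "challenger", "traffic_percentage": 10},
            ]
        },
    },
}

resp = requests.post(
    f"{host}/api/2.0/serving-endpoints",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
resp.raise_for_status()

Once the endpoint exists, the same traffic_config can be updated with PUT /api/2.0/serving-endpoints/<endpoint-name>/config to shift the split between the served models without creating a new endpoint.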

 
