Endpoint deployment is very slow

gbhatia
New Contributor II

HI team 

I am testing some changes on UAT / DEV environment and noticed that the model endpoint are very slow to deploy. Since the environment is just testing and not serving any production traffic, I was wondering if there was a way to expedite this process? I dont need the most stable / secure roll-over in this scenario. 

 

Thanks

Gurpreet

WiliamRosa
Databricks Partner

Hi @gbhatia,

I’d need a few more details to fully understand your deployment, but in general, what can help is setting Compute type: CPU (cheaper and sufficient for testing), Compute scale-out: Small (0–4 concurrency, 0–4 DBU) since you don’t need high concurrency in DEV/UAT, and keeping Scale to zero disabled to avoid cold starts and have the endpoint always ready — noting that this increases costs slightly but makes testing much faster; for production, the recommended practice is to use larger instance sizes, more replicas, and only enable scale to zero for truly intermittent workloads.
https://docs.databricks.com/aws/en/machine-learning/model-serving/create-manage-serving-endpoints

WiliamRosa_0-1757995272360.png

 




Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa

Hi @gbhatia,
Let us know if the solution worked for you.

Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa

gbhatia
New Contributor II

Hi @WiliamRosa 

Thanks for your response on this. I have been using the setting you described aboved, with the exception of `scale_to_zero`. PFA screenshot of the endpoint settings. 
My deployment is a simple Pytorch Deep Learning model wrapped in a `sklearn` Pipeline wrapper. So, something like this:
 ```
Class OnlineModel(mlflow.pyfunc.PythonModel):
def __init__(self, deep_learning_model):
self.model = deep_learning_model # pytorch model
```

 

Screenshot 2025-09-22 at 10.50.52 AM.png




Every small changes to the model takes 20-30 mins of updating the endpoint which causes a significant delay in my testing & development. Wondering if I can decrease this wait time or this is how much databricks endpoint update take and not much i can do with it. 

Gurpreet