Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Endpoint deployment is very slow

gbhatia
New Contributor II

Hi team,

I am testing some changes in a UAT/DEV environment and noticed that model endpoints are very slow to deploy. Since the environment is just for testing and not serving any production traffic, I was wondering if there is a way to expedite this process? I don't need the most stable/secure rollover in this scenario.

 

Thanks

Gurpreet

3 REPLIES

WiliamRosa
New Contributor III

Hi @gbhatia,

I'd need a few more details to fully understand your deployment, but in general these settings help in DEV/UAT:

- Compute type: CPU (cheaper and sufficient for testing).
- Compute scale-out: Small (0-4 concurrency, 0-4 DBU), since you don't need high concurrency in DEV/UAT.
- Scale to zero: disabled, to avoid cold starts and keep the endpoint always ready. This increases costs slightly but makes testing much faster.

For production, the recommended practice is to use larger instance sizes and more replicas, and to enable scale to zero only for truly intermittent workloads.
https://docs.databricks.com/aws/en/machine-learning/model-serving/create-manage-serving-endpoints
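If you configure the endpoint through the REST API (POST /api/2.0/serving-endpoints) rather than the UI, those settings map to a request body roughly like the one below. The endpoint name and registered model name are placeholders; substitute your own.

```python
import json

# Sketch of a serving-endpoint config for DEV/UAT testing.
# "dev-test-endpoint" and the entity_name are placeholder values.
config = {
    "name": "dev-test-endpoint",
    "config": {
        "served_entities": [
            {
                "entity_name": "my_catalog.my_schema.my_model",  # placeholder
                "entity_version": "1",
                "workload_type": "CPU",          # cheaper, enough for testing
                "workload_size": "Small",        # low-concurrency tier
                "scale_to_zero_enabled": False,  # avoid cold starts in DEV/UAT
            }
        ]
    },
}

print(json.dumps(config, indent=2))
```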

[Screenshot: endpoint configuration settings]

 




Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa

Hi @gbhatia,
Let us know if the solution worked for you.


gbhatia
New Contributor II

Hi @WiliamRosa,

Thanks for your response. I have been using the settings you described above, with the exception of `scale_to_zero`. PFA screenshot of the endpoint settings.
My deployment is a simple PyTorch deep learning model wrapped in an `sklearn` Pipeline wrapper. So, something like this:

```
class OnlineModel(mlflow.pyfunc.PythonModel):
    def __init__(self, deep_learning_model):
        self.model = deep_learning_model  # PyTorch model
```
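Fleshed out a bit, the wrapper also has a `predict` method that serving calls on each request. Here is a standalone sketch; the deep-learning model is stood in by a plain callable so the snippet runs without MLflow or PyTorch installed, but in the real code the class subclasses `mlflow.pyfunc.PythonModel` and wraps a `torch.nn.Module`:

```python
class OnlineModel:  # real code: class OnlineModel(mlflow.pyfunc.PythonModel)
    def __init__(self, deep_learning_model):
        self.model = deep_learning_model  # PyTorch model in the real setup

    def predict(self, context, model_input):
        # Serving invokes predict() per request; context carries logged
        # artifacts, model_input is the request payload.
        return [self.model(x) for x in model_input]


# Stand-in for the deep-learning model: doubles its input.
wrapped = OnlineModel(lambda x: 2 * x)
print(wrapped.predict(None, [1, 2, 3]))
```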

 

[Screenshot: endpoint settings, 2025-09-22]




Every small change to the model takes 20-30 minutes of endpoint updating, which causes a significant delay in my testing and development. I'm wondering if I can decrease this wait time, or if this is simply how long Databricks endpoint updates take and there is not much I can do about it.

Gurpreet
