Hi @WiliamRosa
Thanks for your response on this. I have been using the settings you described above, with the exception of `scale_to_zero`. Please find attached a screenshot of the endpoint settings.
My deployment is a simple PyTorch deep learning model wrapped in a `sklearn` Pipeline wrapper. So, something like this:
```
import mlflow

class OnlineModel(mlflow.pyfunc.PythonModel):
    def __init__(self, deep_learning_model):
        self.model = deep_learning_model  # PyTorch model
```

Every small change to the model takes 20-30 minutes to update the endpoint, which significantly delays my testing and development. Is there a way to reduce this wait time, or is this simply how long a Databricks endpoint update takes, with not much I can do about it?
Gurpreet