<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Model Serving - Shadow Deployment - Azure in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138778#M4430</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/30358"&gt;@ryojikn&lt;/a&gt;&amp;nbsp;and &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/197429"&gt;@irtizak&lt;/a&gt;&amp;nbsp;, you’re right. Databricks Model Serving allows splitting traffic between model versions, but it doesn’t have a true shadow deployment where live production traffic is mirrored to a new model for monitoring without affecting user responses.&lt;/P&gt;
&lt;P&gt;For now, you can try a couple of custom approaches:&lt;/P&gt;
&lt;P&gt;1) Deploy one endpoint with your production model and another with the shadow model. On the client side, duplicate each incoming request to both endpoints, but return only the production model’s response to the user. You can capture and compare both responses later using the inference table for analysis.&lt;/P&gt;
&lt;P&gt;2) Wrap your models inside a PyFunc and handle routing within the wrapper itself. You can reference models dynamically using aliases (like champion and challenger) so that whenever a model version changes, you don’t need to update the wrapper code. It’ll automatically select the correct model version based on the alias when the endpoint is updated.&lt;/P&gt;</description>
    <pubDate>Wed, 12 Nov 2025 13:18:59 GMT</pubDate>
    <dc:creator>KaushalVachhani</dc:creator>
    <dc:date>2025-11-12T13:18:59Z</dc:date>
    <item>
      <title>Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/68066#M3242</link>
      <description>&lt;P&gt;Hey,&lt;/P&gt;&lt;P&gt;I'm designing an architecture around Model Serving Endpoints, and one of the needs we're aiming to address is shadow deployment.&lt;/P&gt;&lt;P&gt;Currently, the traffic configurations available in Model Serving don't seem to allow this kind of behavior: mirroring requests to a shadow model while treating its responses as "fire and forget".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is this on your feature backlog? Or do you have an already implemented reference architecture built from Azure components that I could use for this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Fri, 03 May 2024 15:59:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/68066#M3242</guid>
      <dc:creator>ryojikn</dc:creator>
      <dc:date>2024-05-03T15:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138661#M4428</link>
      <description>&lt;P&gt;I have the same query.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Nov 2025 21:51:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138661#M4428</guid>
      <dc:creator>irtizak</dc:creator>
      <dc:date>2025-11-11T21:51:44Z</dc:date>
    </item>
    <item>
      <title>Re: Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138778#M4430</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/30358"&gt;@ryojikn&lt;/a&gt;&amp;nbsp;and &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/197429"&gt;@irtizak&lt;/a&gt;&amp;nbsp;, you’re right. Databricks Model Serving allows splitting traffic between model versions, but it doesn’t have a true shadow deployment where live production traffic is mirrored to a new model for monitoring without affecting user responses.&lt;/P&gt;
&lt;P&gt;For now, you can try a couple of custom approaches:&lt;/P&gt;
&lt;P&gt;1) Deploy one endpoint with your production model and another with the shadow model. On the client side, duplicate each incoming request to both endpoints, but return only the production model’s response to the user. You can capture and compare both responses later using the inference table for analysis.&lt;/P&gt;
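&lt;P&gt;For the client-side mirroring in approach 1, a minimal sketch. The endpoint calls are stubbed out as plain callables; in practice they would be HTTP POSTs to the two serving endpoints' invocation URLs, and the log would stand in for wherever you capture responses (e.g. the inference table) for later comparison:&lt;/P&gt;

```python
import threading

def shadow_invoke(payload, call_prod, call_shadow, log):
    """Send payload to both models; return only the production response.

    call_prod / call_shadow stand in for HTTP calls to the production and
    shadow serving endpoints; log stands in for wherever responses are
    stored for later comparison.
    """
    def _shadow():
        try:
            log.append(("shadow", call_shadow(payload)))
        except Exception as exc:
            # A failing shadow model must never affect the user's response.
            log.append(("shadow_error", repr(exc)))

    # "Fire and forget": the shadow call runs on a background thread.
    t = threading.Thread(target=_shadow, daemon=True)
    t.start()

    prod_response = call_prod(payload)
    log.append(("prod", prod_response))
    t.join()  # joined here only so this demo's log is complete;
              # a real client would not block on the shadow call
    return prod_response
```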
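&lt;P&gt;For the wrapper described next in approach 2, a framework-free sketch of the routing logic. In a real deployment the class would subclass mlflow.pyfunc.PythonModel and load both models in load_context via mlflow.pyfunc.load_model with alias-based URIs such as "models:/catalog.schema.model@champion"; here load_model is injected so the sketch is self-contained, and the model name is hypothetical:&lt;/P&gt;

```python
class ShadowRouter:
    """Serve the champion's predictions; run the challenger in shadow.

    Sketch of a pyfunc-style wrapper. Because the models are referenced by
    alias, the wrapper code never changes when a version is promoted:
    re-point the alias and update the endpoint, and the right versions load.
    """

    def __init__(self, model_name, load_model):
        # load_model stands in for mlflow.pyfunc.load_model.
        self.champion = load_model(f"models:/{model_name}@champion")
        self.challenger = load_model(f"models:/{model_name}@challenger")

    def predict(self, model_input):
        champion_out = self.champion.predict(model_input)
        try:
            # Shadow call: executed so its output can be captured (e.g. in
            # the endpoint's inference table), never returned to the caller.
            self.challenger.predict(model_input)
        except Exception:
            pass  # challenger failures must not affect the response
        return champion_out
```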
&lt;P&gt;2) Wrap your models inside a PyFunc and handle routing within the wrapper itself. You can reference models dynamically using aliases (like champion and challenger) so that whenever a model version changes, you don’t need to update the wrapper code. It’ll automatically select the correct model version based on the alias when the endpoint is updated.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Nov 2025 13:18:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138778#M4430</guid>
      <dc:creator>KaushalVachhani</dc:creator>
      <dc:date>2025-11-12T13:18:59Z</dc:date>
    </item>
    <item>
      <title>Re: Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/143507#M4526</link>
      <description>&lt;P&gt;Out of curiosity, why is the traffic restricted to 100%? Wouldn't it be more flexible to remove this restriction?&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jan 2026 15:43:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/143507#M4526</guid>
      <dc:creator>Davidzuma</dc:creator>
      <dc:date>2026-01-09T15:43:51Z</dc:date>
    </item>
  </channel>
</rss>

