<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Model Serving - Shadow Deployment - Azure in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138778#M4430</link>
    <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/30358"&gt;@ryojikn&lt;/a&gt;&amp;nbsp;and &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/197429"&gt;@irtizak&lt;/a&gt;&amp;nbsp;, you’re right. Databricks Model Serving allows splitting traffic between model versions, but it doesn’t have a true shadow deployment where live production traffic is mirrored to a new model for monitoring without affecting user responses.&lt;/P&gt;
&lt;P&gt;For now, you can try a couple of custom approaches:&lt;/P&gt;
&lt;P&gt;1) Deploy one endpoint with your production model and another with the shadow model. On the client side, duplicate each incoming request to both endpoints, but return only the production model’s response to the user. You can capture and compare both responses later using the inference table for analysis.&lt;/P&gt;
&lt;P&gt;2) Wrap your models inside a PyFunc and handle routing within the wrapper itself. You can reference models dynamically using aliases (like champion and challenger) so that whenever a model version changes, you don’t need to update the wrapper code. It’ll automatically select the correct model version based on the alias when the endpoint is updated.&lt;/P&gt;</description>
    <pubDate>Wed, 12 Nov 2025 13:18:59 GMT</pubDate>
    <dc:creator>KaushalVachhani</dc:creator>
    <dc:date>2025-11-12T13:18:59Z</dc:date>
    <item>
      <title>Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/68066#M3242</link>
      <description>&lt;P&gt;Hey,&lt;/P&gt;&lt;P&gt;I'm designing an architecture around Model Serving Endpoints, and one of the needs we're aiming to address is shadow deployment.&lt;/P&gt;&lt;P&gt;Currently, the traffic configurations available in Model Serving don't seem to allow this kind of behavior: mirroring requests to a shadow model while treating its responses as "fire and forget".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is this on your feature backlog? Or do you have an already implemented reference architecture built from Azure components that I could use for this?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;</description>
      <pubDate>Fri, 03 May 2024 15:59:23 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/68066#M3242</guid>
      <dc:creator>ryojikn</dc:creator>
      <dc:date>2024-05-03T15:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138661#M4428</link>
      <description>&lt;P&gt;I have the same query.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Nov 2025 21:51:44 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138661#M4428</guid>
      <dc:creator>irtizak</dc:creator>
      <dc:date>2025-11-11T21:51:44Z</dc:date>
    </item>
    <item>
      <title>Re: Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138778#M4430</link>
      <description>&lt;P&gt;&lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/30358"&gt;@ryojikn&lt;/a&gt;&amp;nbsp;and &lt;a href="https://community.databricks.com/t5/user/viewprofilepage/user-id/197429"&gt;@irtizak&lt;/a&gt;&amp;nbsp;, you’re right. Databricks Model Serving allows splitting traffic between model versions, but it doesn’t have a true shadow deployment where live production traffic is mirrored to a new model for monitoring without affecting user responses.&lt;/P&gt;
&lt;P&gt;For now, you can try a couple of custom approaches:&lt;/P&gt;
&lt;P&gt;1) Deploy one endpoint with your production model and another with the shadow model. On the client side, duplicate each incoming request to both endpoints, but return only the production model’s response to the user. You can capture and compare both responses later using the inference table for analysis.&lt;/P&gt;
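&lt;P&gt;For the client-side mirroring in approach 1, a minimal sketch. The endpoint calls are stubbed out as plain callables; in practice they would be HTTP POSTs to the two serving endpoints' invocation URLs, and the log would stand in for wherever you capture responses (e.g. the inference table) for later comparison:&lt;/P&gt;

```python
import threading

def shadow_invoke(payload, call_prod, call_shadow, log):
    """Send payload to both models; return only the production response.

    call_prod / call_shadow stand in for HTTP calls to the production and
    shadow serving endpoints; log stands in for wherever responses are
    stored for later comparison.
    """
    def _shadow():
        try:
            log.append(("shadow", call_shadow(payload)))
        except Exception as exc:
            # A failing shadow model must never affect the user's response.
            log.append(("shadow_error", repr(exc)))

    # "Fire and forget": the shadow call runs on a background thread.
    t = threading.Thread(target=_shadow, daemon=True)
    t.start()

    prod_response = call_prod(payload)
    log.append(("prod", prod_response))
    t.join()  # joined here only so this demo's log is complete;
              # a real client would not block on the shadow call
    return prod_response
```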
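&lt;P&gt;For the wrapper described next in approach 2, a framework-free sketch of the routing logic. In a real deployment the class would subclass mlflow.pyfunc.PythonModel and load both models in load_context via mlflow.pyfunc.load_model with alias-based URIs such as "models:/catalog.schema.model@champion"; here load_model is injected so the sketch is self-contained, and the model name is hypothetical:&lt;/P&gt;

```python
class ShadowRouter:
    """Serve the champion's predictions; run the challenger in shadow.

    Sketch of a pyfunc-style wrapper. Because the models are referenced by
    alias, the wrapper code never changes when a version is promoted:
    re-point the alias and update the endpoint, and the right versions load.
    """

    def __init__(self, model_name, load_model):
        # load_model stands in for mlflow.pyfunc.load_model.
        self.champion = load_model(f"models:/{model_name}@champion")
        self.challenger = load_model(f"models:/{model_name}@challenger")

    def predict(self, model_input):
        champion_out = self.champion.predict(model_input)
        try:
            # Shadow call: executed so its output can be captured (e.g. in
            # the endpoint's inference table), never returned to the caller.
            self.challenger.predict(model_input)
        except Exception:
            pass  # challenger failures must not affect the response
        return champion_out
```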
&lt;P&gt;2) Wrap your models inside a PyFunc and handle routing within the wrapper itself. You can reference models dynamically using aliases (like champion and challenger) so that whenever a model version changes, you don’t need to update the wrapper code. It’ll automatically select the correct model version based on the alias when the endpoint is updated.&lt;/P&gt;</description>
      <pubDate>Wed, 12 Nov 2025 13:18:59 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/138778#M4430</guid>
      <dc:creator>KaushalVachhani</dc:creator>
      <dc:date>2025-11-12T13:18:59Z</dc:date>
    </item>
    <item>
      <title>Re: Model Serving - Shadow Deployment - Azure</title>
      <link>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/143507#M4526</link>
      <description>&lt;P&gt;Out of curiosity, why is the traffic restricted to 100%? Wouldn't it be more flexible to remove this restriction?&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jan 2026 15:43:51 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/model-serving-shadow-deployment-azure/m-p/143507#M4526</guid>
      <dc:creator>Davidzuma</dc:creator>
      <dc:date>2026-01-09T15:43:51Z</dc:date>
    </item>
  </channel>
</rss>

