Overview
Manufacturers price millions of parts across product families, geographies, sales units, and channels. These differences often require dozens of specialized ML models, each optimized for a particular part segment, sales region, or business unit. While this blog uses manufacturing as a case study, the dynamic model routing pattern applies to any application that needs a single endpoint to select among multiple ML models based on request attributes.
With a nod to Tim Peters' Zen of Python ("explicit is better than implicit"), we first define the pattern, since the term is used in different ways. Dynamic Model Routing is a pattern in which a system explicitly chooses the best-suited model for each request at runtime, based on context, cost, performance, or other domain-specific criteria. This blog shows both the logical architecture and the implementation strategy for the pattern on Databricks.
In pricing for manufacturing, this matters because pricing models rarely evolve at the same pace: some part families demand frequent retraining due to volatile demand or competitive pressure, while others change slowly and stay stable for months. A modular architecture lets each model group evolve independently—retraining when needed, deploying new versions, and leaving the rest untouched—while applications enjoy a simple contract: one API where they submit part information and receive forecasted prices.
In other words, your architecture can be as complex as it needs to be behind the scenes, but your users still get to say: “Give me one API to submit part information and get back forecasted prices”—and never worry about which model did the work.
A dynamic modular architecture supports:
- One shared endpoint for all applications.
- Automatic model selection via routing logic.
- Real-time & batch inference with the same code.
- Autoscaling, concurrency, and lookup performance handled entirely by Databricks.
This blog walks through how to build a Dynamic Model Router on Databricks, powered by:
- Unity Catalog for governance of models, features, and routing configuration (to ensure isolation, lineage, auditability, and controlled promotions/rollbacks).
- Model Serving for scalable, low-latency inference.
- Feature Store + Online Feature Store for feature consistency and enrichment (with millisecond lookups needed in real-time parts-pricing scenarios).
- MLflow for versioning, lifecycle management, and version isolation for each model group.
1. Many Applications, Many Requests, Many Models — One Unified Endpoint
A global pricing ecosystem typically has:
- Multiple product families, each with different pricing dynamics.
- Different unit categories (retail, fleet, bulk, region-specific).
- Regional and channel-specific pricing rules.
- Frequent retraining, requiring version isolation for each model group.
Even with 20+ pricing models behind the scenes, the consuming applications expect:
- One request → One API.
- Flexible input payloads.
- Real-time responses.
- No knowledge of which underlying model generated the prediction.
Note on terminology: “pricing models” refers to ML models specialized for specific manufacturing slices (for example, segment × region × channel), each mapped 1:1 to a deployable model that can be versioned and promoted independently.
Example: Multi-Input Requests from Multiple Applications
Application 1
{
"part": "P-1001",
"product_segment": "SEG12",
"unit_category": "UNIT_A1",
"order": "O-1001",
"region": "US"
}
Application 2
{
"part": "P-1002",
"product_segment": "SEG19",
"unit_category": "UNIT_B2",
"order": "O-1005",
"region": "US"
}
Application 3
{
"part": "P-1006",
"product_segment": "SEG19",
"unit_category": "UNIT_E5",
"order": "O-1005",
"region": "EU"
}
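The three payloads above share one shape, which a router can parse into a typed request before routing. The sketch below is a minimal illustration using only the standard library; the `PricingRequest` class and `parse_request` helper are hypothetical names, and treating every field except `part` as optional is an assumption made to reflect the "flexible input payloads" expectation.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class PricingRequest:
    """One pricing request; field names mirror the example payloads."""
    part: str
    product_segment: Optional[str] = None
    unit_category: Optional[str] = None
    order: Optional[str] = None
    region: Optional[str] = None


def parse_request(raw: str) -> PricingRequest:
    """Parse a JSON payload, requiring `part` and ignoring unknown keys."""
    data = json.loads(raw)
    if "part" not in data:
        raise ValueError("payload must include 'part'")
    known_names = {f.name for f in fields(PricingRequest)}
    return PricingRequest(**{k: v for k, v in data.items() if k in known_names})


# Application 3's payload from the example above.
req = parse_request(
    '{"part": "P-1006", "product_segment": "SEG19", '
    '"unit_category": "UNIT_E5", "order": "O-1005", "region": "EU"}'
)
```

Attributes the caller leaves out stay `None`, so a later enrichment step can fill them in.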
2. End-to-End Architecture
Logical Architecture: Dynamic Model Router Responsibilities
- Enriches inputs with Feature Store lookup.
- Applies routing rules.
- Groups requests by model key.
- Calls the correct pricing model endpoint(s) in parallel and reassembles results in the original request order.
This router endpoint provides the orchestration layer (enrich → route → batch → fan-out → reassemble), so applications always call one endpoint.
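The route → batch → fan-out → reassemble portion of that flow can be sketched with the standard library alone. This is an illustrative sketch, not the router shown later in this blog: `fake_endpoint` and `route` are hypothetical stand-ins for a Model Serving call and the routing rules, the prices are made up, and enrichment is omitted for brevity.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def fake_endpoint(model_key, rows):
    """Hypothetical stand-in for a Model Serving call: one price per row."""
    base = {"R1": 10.0, "R2": 20.0}[model_key]
    return [base + len(row["part"]) for row in rows]


def route(row):
    """Hypothetical rule: UNIT_A* goes to R1, everything else to R2."""
    return "R1" if row["unit_category"].startswith("UNIT_A") else "R2"


def orchestrate(rows, call=fake_endpoint):
    # Batch: group row positions by model key so each model gets one call.
    batches = defaultdict(list)
    for idx, row in enumerate(rows):
        batches[route(row)].append(idx)

    results = [None] * len(rows)
    # Fan out: submit one call per model key to a thread pool.
    with ThreadPoolExecutor(max_workers=max(1, len(batches))) as pool:
        futures = {
            key: pool.submit(call, key, [rows[i] for i in idxs])
            for key, idxs in batches.items()
        }
        # Reassemble: write each prediction back to its original position.
        for key, idxs in batches.items():
            for i, price in zip(idxs, futures[key].result()):
                results[i] = {
                    "part": rows[i]["part"],
                    "forecast_price": price,
                    "source_model": key,
                }
    return results


rows = [
    {"part": "P-1001", "unit_category": "UNIT_A1"},
    {"part": "P-2002", "unit_category": "UNIT_D4"},
    {"part": "P-3003", "unit_category": "UNIT_A1"},
]
out = orchestrate(rows)
```

Tracking positions rather than sorting by part number keeps the output order stable even when the same part appears more than once in a request.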
Core Databricks Components: Feature Store & Online Feature Store
- Store and serve features such as product_segment and unit_category; provide low-latency lookups for real-time serving; guarantee consistency between training and inference.
Pricing Models: Distinct Logical Components in the Dynamic Model Routing Pattern
- Each pricing model is a first-class component with a clear responsibility and lifecycle: it can be retrained, versioned, and promoted independently, and is deployed as its own autoscaled, Unity-Catalog–governed Model Serving endpoint. The serving layer then exposes all of these endpoints through a single pricing interface, so consuming applications have one unified API while many specialized models work behind the scenes.
Example: Response
[
{"part": "P-1001", "forecast_price": 17.42, "source_model": "R1_SEG12"},
{"part": "P-1006", "forecast_price": 23.10, "source_model": "R2_SEG12"}
]
3. Generic Routing Logic for a Pricing Use Case
To determine which pricing model should process a given request, the router relies on the following attributes, defined by business stakeholders:
- part — part number.
- product_segment — grouping related parts into families.
- unit_category — describes how parts are sold (retail, fleet, channel-specific, etc.).
- order — order number.
- region — US, CAN, EU, etc.
These are stored in the Feature Store and are accessible in real time via the Online Feature Store.
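To make the enrichment step concrete, here is a minimal sketch in which a plain dictionary stands in for the online lookup. `ONLINE_FEATURES` and `enrich` are hypothetical names, and letting caller-supplied values take precedence over looked-up values is an assumption, not something this blog prescribes.

```python
# Hypothetical in-memory stand-in for an online feature lookup keyed by part.
ONLINE_FEATURES = {
    "P-1001": {"product_segment": "SEG12", "unit_category": "UNIT_A1"},
    "P-1006": {"product_segment": "SEG19", "unit_category": "UNIT_E5"},
}


def enrich(request, features=ONLINE_FEATURES):
    """Fill in routing attributes the caller omitted; caller values win."""
    enriched = dict(features.get(request["part"], {}))
    # Overlay non-null request values on top of the looked-up features.
    enriched.update({k: v for k, v in request.items() if v is not None})
    return enriched


full = enrich({"part": "P-1001", "region": "US"})
```

A part with no stored features passes through unchanged, which leaves the routing step free to decide how unmatched requests are handled.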
Example: Routing Table
| Model Key | product_segment | unit_category | part | order | region |
| R1_SEG12 | "SEG12, SEG19" | "UNIT_A1, UNIT_B2, UNIT_C3" | P-1001 | O-1001 | US |
| R2_SEG12 | "SEG12, SEG19" | "UNIT_D4, UNIT_E5, UNIT_F6" | P-1005 | O-1007 | EU |
Example: Routing Logic
IF product_segment IN ('SEG12', 'SEG19')
AND unit_category IN ('UNIT_A1', 'UNIT_B2', 'UNIT_C3')
AND part = 'P-1001'
AND order = 'O-1001'
AND region = 'US'
THEN use model R1_SEG12
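The rule above can be expressed as data plus a small first-match function. The sketch below is one possible encoding, not the format Databricks requires: `ROUTING_RULES` and `select_model_key` are hypothetical names, and the `part` and `order` conditions from the routing table are deliberately left out (an assumption) so that the three application payloads shown earlier each match a rule.

```python
# Each rule lists the allowed values per attribute; names mirror the table.
ROUTING_RULES = [
    {
        "model_key": "R1_SEG12",
        "product_segment": {"SEG12", "SEG19"},
        "unit_category": {"UNIT_A1", "UNIT_B2", "UNIT_C3"},
        "region": {"US"},
    },
    {
        "model_key": "R2_SEG12",
        "product_segment": {"SEG12", "SEG19"},
        "unit_category": {"UNIT_D4", "UNIT_E5", "UNIT_F6"},
        "region": {"EU"},
    },
]


def select_model_key(request, rules=ROUTING_RULES):
    """Return the key of the first rule whose conditions all match, else None."""
    for rule in rules:
        conditions = {k: v for k, v in rule.items() if k != "model_key"}
        if all(request.get(field) in allowed
               for field, allowed in conditions.items()):
            return rule["model_key"]
    return None


app1 = {"part": "P-1001", "product_segment": "SEG12",
        "unit_category": "UNIT_A1", "order": "O-1001", "region": "US"}
app3 = {"part": "P-1006", "product_segment": "SEG19",
        "unit_category": "UNIT_E5", "order": "O-1005", "region": "EU"}
key1 = select_model_key(app1)
key3 = select_model_key(app3)
```

Returning `None` for unmatched requests makes the fallback decision explicit at the call site instead of hiding it inside the matcher.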
Each routing key (e.g., R1_SEG12) maps to a model in Unity Catalog, such as:
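- model_uri -> models:/pricing_model_R1_SEG12/Production
With Unity Catalog, governance spans models, features, and routing configuration, providing the lineage and permissions needed for safe, controlled changes. Note that models registered in Unity Catalog are referenced by a three-level name plus a version or alias rather than a stage, so the stage-style URI above should be read as illustrative.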
4. How the Router Decides Which Model to Call
Example requests:
[
{"part": "P-1001", "product_segment": "SEG12", "unit_category": "UNIT_A1"},
{"part": "P-2002", "product_segment": "SEG12", "unit_category": "UNIT_D4"},
{"part": "P-3003", "product_segment": "SEG12", "unit_category": "UNIT_A1"}
]
Routing Result
| part | segment | category | model |
| P-1001 | SEG12 | UNIT_A1 | R1 |
| P-2002 | SEG12 | UNIT_D4 | R2 |
| P-3003 | SEG12 | UNIT_A1 | R1 |
Router Batches
- Batch R1 → P-1001, P-3003.
- Batch R2 → P-2002.
Each batch is sent to the corresponding Model Serving endpoint.
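The grouping in the routing result table can be reproduced in a few lines. This is a small sketch under the assumption that routing has already produced (part, model) pairs; `build_batches` is a hypothetical helper name.

```python
from collections import defaultdict

# (part, model) pairs taken from the routing result table.
routed = [
    ("P-1001", "R1"),
    ("P-2002", "R2"),
    ("P-3003", "R1"),
]


def build_batches(routed_rows):
    """Group (part, model) pairs by model, keeping each row's position."""
    batches = defaultdict(list)
    for position, (part, model) in enumerate(routed_rows):
        batches[model].append((position, part))
    return dict(batches)


batches = build_batches(routed)
```

Carrying the original position alongside each part is what later lets the router return predictions in request order.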
5. Router Implementation Using MLflow & Model Serving
Here is the simplified router class:
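import mlflow
import pandas as pd
from databricks.feature_engineering import FeatureEngineeringClient
from databricks.sdk import WorkspaceClient


class PricingRouter(mlflow.pyfunc.PythonModel):
    def __init__(self, routing_config, feature_table):
        self.routing_config = routing_config
        self.feature_table = feature_table

    def load_context(self, context):
        # Clients are created at load time so they are not pickled with the model.
        self.fe = FeatureEngineeringClient()
        self.ws = WorkspaceClient()
        # Map each routing key to its serving endpoint once, up front.
        self.endpoints = {rule["model_key"]: rule["endpoint_name"]
                          for rule in self.routing_config}

    def _enrich_features(self, df):
        # NOTE: simplified placeholder for an online feature lookup by part_id.
        # The arguments shown are illustrative and may not match the
        # score_batch signature in your client version; substitute your own
        # online lookup here.
        lookup = self.fe.score_batch(
            table_name=self.feature_table,
            lookup_key="part_id",
            df=df[["part_id"]]
        )
        return df.merge(lookup, on="part_id", how="left")

    def _select_model_key(self, row):
        # First matching rule wins; unmatched rows fall back to "default".
        for rule in self.routing_config:
            if (row["product_segment"] in rule["product_segments"]
                    and row["unit_category"] in rule["unit_categories"]):
                return rule["model_key"]
        return "default"

    def _call_endpoint(self, endpoint_name, payload):
        resp = self.ws.serving_endpoints.query(
            name=endpoint_name,
            dataframe_records=payload
        )
        return pd.DataFrame(resp.predictions)

    def predict(self, context, model_input):
        df = self._enrich_features(model_input.copy())
        df["__row_id__"] = range(len(df))
        df["routing_key"] = df.apply(self._select_model_key, axis=1)
        outputs = []
        for key, group in df.groupby("routing_key"):
            # Fail with a clear message if a key (including "default")
            # has no rule in the routing config.
            if key not in self.endpoints:
                raise KeyError(f"No endpoint configured for routing key '{key}'")
            # Send only model inputs, not the router's bookkeeping columns.
            payload = (group.drop(columns=["__row_id__", "routing_key"])
                            .to_dict("records"))
            preds = self._call_endpoint(self.endpoints[key], payload)
            preds["__row_id__"] = group["__row_id__"].values
            preds["source_model"] = key
            outputs.append(preds)
        # Restore the original request order before returning.
        return (pd.concat(outputs)
                  .sort_values("__row_id__")
                  .drop(columns=["__row_id__"]))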
Deployment with MLflow
router = PricingRouter(
    routing_config=my_routing_rules,
    feature_table="main.supply_chain.part_features"
)
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        "pricing_router",
        python_model=router,
        input_example=pd.DataFrame({"part_id": ["P-1001"]})
    )
# Deploy to Databricks Model Serving.
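The router reads four keys from each entry of `my_routing_rules`: `model_key`, `endpoint_name`, `product_segments`, and `unit_categories`. A minimal sketch of what that configuration might look like, with a check that runs before the model is logged, follows. The endpoint names and the `validate_routing_rules` helper are hypothetical; the segment and category values come from the routing table above.

```python
REQUIRED_KEYS = {"model_key", "endpoint_name", "product_segments", "unit_categories"}


def validate_routing_rules(rules):
    """Raise ValueError if a rule is malformed or a model key repeats."""
    seen = set()
    for i, rule in enumerate(rules):
        missing = REQUIRED_KEYS - rule.keys()
        if missing:
            raise ValueError(f"rule {i} is missing keys: {sorted(missing)}")
        if rule["model_key"] in seen:
            raise ValueError(f"duplicate model_key: {rule['model_key']}")
        seen.add(rule["model_key"])
    return rules


my_routing_rules = validate_routing_rules([
    {
        "model_key": "R1_SEG12",
        "endpoint_name": "pricing-r1-seg12",  # hypothetical endpoint name
        "product_segments": ["SEG12", "SEG19"],
        "unit_categories": ["UNIT_A1", "UNIT_B2", "UNIT_C3"],
    },
    {
        "model_key": "R2_SEG12",
        "endpoint_name": "pricing-r2-seg12",  # hypothetical endpoint name
        "product_segments": ["SEG12", "SEG19"],
        "unit_categories": ["UNIT_D4", "UNIT_E5", "UNIT_F6"],
    },
])
```

Validating at logging time surfaces a malformed rule before deployment rather than as a failed prediction at serving time.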
Databricks automatically handles autoscaling, concurrency, endpoint isolation, failover, versioning, and security.
6. Batch Forecasting Using Spark
Use the same router logic for batch scoring:
import mlflow
from pyspark.sql import functions as F

# Wrap the registered router as a Spark UDF that returns a struct per row.
router_udf = mlflow.pyfunc.spark_udf(
    spark,
    model_uri="models:/pricing_router/Production",
    result_type="struct<forecast_price:double, source_model:string>"
)
parts_df = spark.table("main.supply_chain.open_quotes")
results = parts_df.withColumn(
    "prediction",
    router_udf(F.struct("part_id"))
)
Batch and real-time remain aligned; one artifact, two modes.
7. Why Databricks is a Strong Fit
- Fully Managed Model Serving - No Kubernetes configuration; no API gateways; built-in autoscaling; high concurrency + high throughput.
- Online Feature Store Integration - Millisecond-level feature lookup; same features in training + inference to reduce drift.
- Unity Catalog Governance - One security model; lineage across models, features, and tables; auditing and access control; governed routing config for safe changes.
- MLflow for Model Lifecycle - Versioning; reproducible deployments; multi-model management at scale.
- Model Isolation + Modular Retraining - Each model retrained & deployed without touching other model groups; refresh low-quality or drifting models independently; router automatically routes to latest promoted versions.
- Multi-Model Architecture, One Endpoint - Router orchestrates all routing; new model = just a config update; no app changes.
- Real-Time + Batch in Harmony - Same logic everywhere; zero code duplication.
8. Closing Thoughts
Pricing across a global supply chain requires multiple ML models, but applications shouldn't inherit that complexity. A Dynamic Model Router on Databricks places many models behind a single endpoint, giving all apps a clean API along with governed model management, consistent features, low-latency real-time pricing, and scalable batch processing. Databricks handles the infrastructure, letting teams focus on pricing intelligence rather than on operating the system.