Manufacturers price millions of parts across product families, geographies, sales units, and channels. These differences often require dozens of specialized ML models, each optimized for a particular part segment, sales region, or business unit. While this blog focuses on manufacturing as a case study, the dynamic model routing pattern applies broadly to any application that needs a single endpoint to select among multiple ML models based on request attributes.

With a nod to Tim Peters’ Zen of Python (“explicit is better than implicit”), let’s take a moment to define the Dynamic Model Routing pattern, since the term is used in different ways. Dynamic Model Routing is a pattern in which a system explicitly chooses the best-suited model for each request at runtime, based on context, cost, performance, or other domain-specific criteria. We show both the logical architecture and the implementation strategy for this design pattern on Databricks.
In pricing for manufacturing, this matters because pricing models rarely evolve at the same pace: some part families demand frequent retraining due to volatile demand or competitive pressure, while others change slowly and stay stable for months. A modular architecture lets each model group evolve independently—retraining when needed, deploying new versions, and leaving the rest untouched—while applications enjoy a simple contract: one API where they submit part information and receive forecasted prices.
In other words, your architecture can be as complex as it needs to be behind the scenes, but your users still get to say: “Give me one API to submit part information and get back forecasted prices”—and never worry about which model did the work.
A dynamic modular architecture supports:
This blog walks through how to build a Dynamic Model Router on Databricks, powered by:
A global pricing ecosystem typically has:
Even with 20+ pricing models behind the scenes, the consuming applications expect:
Note on terminology: “pricing models” refers to ML models specialized for specific manufacturing slices (for example, segment × region × channel), each mapped 1:1 to a deployable model that can be versioned and promoted independently.
Application 1
{
"part": "P-1001",
"product_segment": "SEG12",
"unit_category": "UNIT_A1",
"order": "O-1001",
"region": "US"
}
Application 2
{
"part": "P-1002",
"product_segment": "SEG19",
"unit_category": "UNIT_B2",
"order": "O-1005",
"region": "US"
}
Application 3
{
"part": "P-1006",
"product_segment": "SEG19",
"unit_category": "UNIT_E5",
"order": "O-1005",
"region": "EU"
}
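The three payloads above share one shape, which is what lets a single API serve every application. Here is a minimal sketch of parsing and validating that shape, assuming all five fields are required strings (the examples do not say which fields are optional). `PricingRequest` and `parse_request` are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PricingRequest:
    """One pricing request, mirroring the example payloads (assumed shape)."""
    part: str
    product_segment: str
    unit_category: str
    order: str
    region: str


def parse_request(raw: str) -> PricingRequest:
    """Parse a JSON payload and fail fast on missing or non-string fields."""
    data = json.loads(raw)
    names = [f.name for f in fields(PricingRequest)]
    missing = [n for n in names if n not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    bad = [n for n in names if not isinstance(data[n], str)]
    if bad:
        raise ValueError(f"non-string fields: {bad}")
    # Extra keys in the payload are ignored rather than rejected.
    return PricingRequest(**{n: data[n] for n in names})


# Application 1's payload from the examples above.
app1 = parse_request(
    '{"part": "P-1001", "product_segment": "SEG12", '
    '"unit_category": "UNIT_A1", "order": "O-1001", "region": "US"}'
)
```

Rejecting malformed requests at the edge keeps routing errors from surfacing later as confusing model failures.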
Logical Architecture: Dynamic Model Router Responsibilities
Core Databricks Components: Feature Store & Online Feature Store
Pricing Models: Distinct Logical Components in the Dynamic Model Routing Pattern
[
{"part": "P-1001", "forecast_price": 17.42, "source_model": "R1_SEG12"},
{"part": "P-1006", "forecast_price": 23.10, "source_model": "R2_SEG12"}
]
To determine which pricing model should process a given request, the router relies on the following features, defined by business stakeholders:
These are stored in the Feature Store and are accessible in real time via the Online Feature Store.
| Model Key | product_segment | unit_category | part | order | region |
| R1_SEG12 | "SEG12, SEG19" | "UNIT_A1, UNIT_B2, UNIT_C3" | P-1001 | O-1001 | US |
| R2_SEG12 | "SEG12, SEG19" | "UNIT_D4, UNIT_E5, UNIT_F6" | P-1005 | O-1007 | EU |
IF product_segment IN ('SEG12', 'SEG19')
AND unit_category IN ('UNIT_A1', 'UNIT_B2', 'UNIT_C3')
AND part = 'P-1001'
AND order = 'O-1001'
AND region = 'US'
THEN use model R1_SEG12
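The rule above can be expressed as data plus one small matching function. This is a sketch under assumptions: the dictionary layout and the names `ROUTING_RULES` and `select_model_key` are illustrative, the values are transcribed from the routing table, and a request that matches no rule returns `None` here (the source does not say what should happen in that case).

```python
from typing import Optional

# Rules transcribed from the routing table; the dict layout is an assumption.
ROUTING_RULES = [
    {
        "model_key": "R1_SEG12",
        "product_segment": {"SEG12", "SEG19"},
        "unit_category": {"UNIT_A1", "UNIT_B2", "UNIT_C3"},
        "part": {"P-1001"},
        "order": {"O-1001"},
        "region": {"US"},
    },
    {
        "model_key": "R2_SEG12",
        "product_segment": {"SEG12", "SEG19"},
        "unit_category": {"UNIT_D4", "UNIT_E5", "UNIT_F6"},
        "part": {"P-1005"},
        "order": {"O-1007"},
        "region": {"EU"},
    },
]

MATCH_FIELDS = ("product_segment", "unit_category", "part", "order", "region")


def select_model_key(request: dict) -> Optional[str]:
    """Return the key of the first rule whose every condition holds, else None."""
    for rule in ROUTING_RULES:
        # A field missing from the request yields None, which is in no rule set.
        if all(request.get(field) in rule[field] for field in MATCH_FIELDS):
            return rule["model_key"]
    return None


application_1 = {
    "part": "P-1001",
    "product_segment": "SEG12",
    "unit_category": "UNIT_A1",
    "order": "O-1001",
    "region": "US",
}
chosen = select_model_key(application_1)
```

Keeping rules as data rather than hard-coded branches means business stakeholders can change routing without a code change to the router.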
Each routing key (e.g., R1_SEG12) maps to a model in Unity Catalog, such as:
Example requests:
[
{"part": "P-1001", "product_segment": "SEG12", "unit_category": "UNIT_A1"},
{"part": "P-2002", "product_segment": "SEG12", "unit_category": "UNIT_D4"},
{"part": "P-3003", "product_segment": "SEG12", "unit_category": "UNIT_A1"}
]
Routing Result
| part | segment | category | model |
| P-1001 | SEG12 | UNIT_A1 | R1 |
| P-2002 | SEG12 | UNIT_D4 | R2 |
| P-3003 | SEG12 | UNIT_A1 | R1 |
Router Batches
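Once each request has a routing key, the router can group requests so that each model is called once per batch instead of once per row. The sketch below reproduces the routing result for the three example requests and groups them by model. It assumes routing on `product_segment` and `unit_category` only, and the names `rules`, `route`, and `build_batches` are hypothetical.

```python
from collections import defaultdict

requests = [
    {"part": "P-1001", "product_segment": "SEG12", "unit_category": "UNIT_A1"},
    {"part": "P-2002", "product_segment": "SEG12", "unit_category": "UNIT_D4"},
    {"part": "P-3003", "product_segment": "SEG12", "unit_category": "UNIT_A1"},
]

# Assumed rule layout: segment and unit-category sets per model.
rules = [
    {"model": "R1", "segments": {"SEG12", "SEG19"},
     "categories": {"UNIT_A1", "UNIT_B2", "UNIT_C3"}},
    {"model": "R2", "segments": {"SEG12", "SEG19"},
     "categories": {"UNIT_D4", "UNIT_E5", "UNIT_F6"}},
]


def route(request):
    """Return the first model whose segment and category sets contain the request."""
    for rule in rules:
        if (request["product_segment"] in rule["segments"]
                and request["unit_category"] in rule["categories"]):
            return rule["model"]
    return "default"


def build_batches(reqs):
    """Group requests by model, keeping each request's original position."""
    batches = defaultdict(list)
    for position, req in enumerate(reqs):
        batches[route(req)].append((position, req))
    return dict(batches)


batches = build_batches(requests)
```

Storing the original position alongside each request is what later allows results to be returned in the order the caller sent them.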
Here is the simplified router class:
import mlflow
import pandas as pd
from databricks.feature_engineering import FeatureEngineeringClient
from databricks.sdk import WorkspaceClient


class PricingRouter(mlflow.pyfunc.PythonModel):
    def __init__(self, routing_config, feature_table):
        self.routing_config = routing_config
        self.feature_table = feature_table

    def load_context(self, context):
        # Clients are created at load time so they are not pickled with the model.
        self.fe = FeatureEngineeringClient()
        self.ws = WorkspaceClient()

    def _enrich_features(self, df):
        # Simplified feature lookup; check the FeatureEngineeringClient docs
        # for the exact call and arguments available in your environment.
        lookup = self.fe.score_batch(
            table_name=self.feature_table,
            lookup_key="part_id",
            df=df[["part_id"]]
        )
        return df.merge(lookup, on="part_id", how="left")

    def _select_model_key(self, row):
        # First matching rule wins; unmatched rows fall through to "default".
        for rule in self.routing_config:
            if (row["product_segment"] in rule["product_segments"]
                    and row["unit_category"] in rule["unit_categories"]):
                return rule["model_key"]
        return "default"

    def _call_endpoint(self, endpoint_name, payload):
        resp = self.ws.serving_endpoints.query(
            name=endpoint_name,
            dataframe_records=payload
        )
        return pd.DataFrame(resp.predictions)

    def predict(self, context, model_input):
        df = self._enrich_features(model_input.copy())
        # Remember the input order so results can be returned in the same order.
        df["__row_id__"] = range(len(df))
        df["routing_key"] = df.apply(self._select_model_key, axis=1)
        outputs = []
        for key, group in df.groupby("routing_key"):
            # routing_config must include a rule whose model_key is "default";
            # otherwise next() raises StopIteration for unmatched rows.
            endpoint = next(rule["endpoint_name"]
                            for rule in self.routing_config
                            if rule["model_key"] == key)
            payload = group.to_dict("records")
            preds = self._call_endpoint(endpoint, payload)
            preds["__row_id__"] = group["__row_id__"].values
            preds["source_model"] = key
            outputs.append(preds)
        return (pd.concat(outputs)
                .sort_values("__row_id__")
                .drop(columns=["__row_id__"]))
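The core of `predict` is a group, dispatch, and reassemble loop. The sketch below isolates that loop in plain Python so it can be run without Databricks. Everything here is a stand-in: `model_r1` and `model_r2` are stub functions returning made-up prices in place of served endpoints, and `route` is a simplified rule on `unit_category` only.

```python
from collections import defaultdict


def model_r1(rows):
    """Stub endpoint: returns a made-up price per row."""
    return [{"forecast_price": 10.0} for _ in rows]


def model_r2(rows):
    """Stub endpoint: returns a different made-up price per row."""
    return [{"forecast_price": 20.0} for _ in rows]


ENDPOINTS = {"R1": model_r1, "R2": model_r2}


def route(row):
    """Simplified rule (assumption): route on unit_category alone."""
    r1_categories = {"UNIT_A1", "UNIT_B2", "UNIT_C3"}
    return "R1" if row["unit_category"] in r1_categories else "R2"


def predict(rows):
    """Group rows by model, call each model once, restore the input order."""
    groups = defaultdict(list)
    for row_id, row in enumerate(rows):
        groups[route(row)].append((row_id, row))

    results = [None] * len(rows)
    for key, members in groups.items():
        preds = ENDPOINTS[key]([row for _, row in members])
        if len(preds) != len(members):
            raise ValueError(f"model {key} returned {len(preds)} rows, "
                             f"expected {len(members)}")
        for (row_id, row), pred in zip(members, preds):
            # Tag each prediction with the model that produced it.
            results[row_id] = {"part": row["part"], **pred, "source_model": key}
    return results


rows = [
    {"part": "P-1001", "unit_category": "UNIT_A1"},
    {"part": "P-2002", "unit_category": "UNIT_D4"},
    {"part": "P-3003", "unit_category": "UNIT_A1"},
]
results = predict(rows)
```

The length check guards against a model silently dropping rows, which would otherwise misalign predictions with the parts they belong to.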
Deployment with MLflow
router = PricingRouter(
    routing_config=my_routing_rules,
    feature_table="main.supply_chain.part_features"
)
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        "pricing_router",
        python_model=router,
        input_example=pd.DataFrame({"part_id": ["P-1001"]})
    )
# Deploy to Databricks Model Serving.
Databricks automatically handles autoscaling, concurrency, endpoint isolation, failover, versioning, and security.
Use the same router logic for batch scoring:
from pyspark.sql import functions as F

router_udf = mlflow.pyfunc.spark_udf(
    spark,
    model_uri="models:/pricing_router/Production",
    result_type="struct<forecast_price:double, source_model:string>"
)
parts_df = spark.table("main.supply_chain.open_quotes")
results = parts_df.withColumn(
    "prediction",
    router_udf(F.struct("part_id"))
)
Batch and real-time remain aligned; one artifact, two modes.
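One way to see why the two modes stay aligned is that both go through the same scoring function. The sketch below is illustrative only: `score` is a stub that routes without calling any model, and `score_realtime` and `score_in_batches` are hypothetical wrappers standing in for the serving endpoint and the Spark UDF.

```python
def score(rows):
    """Single scoring path shared by both modes (stub routing, no model call)."""
    r1_categories = {"UNIT_A1", "UNIT_B2", "UNIT_C3"}
    return [
        {"part": row["part"],
         "source_model": "R1" if row["unit_category"] in r1_categories else "R2"}
        for row in rows
    ]


def score_realtime(request):
    """Real-time mode: one request in, one result out."""
    return score([request])[0]


def score_in_batches(rows, chunk_size=2):
    """Batch mode: the same function applied chunk by chunk."""
    results = []
    for start in range(0, len(rows), chunk_size):
        results.extend(score(rows[start:start + chunk_size]))
    return results


quotes = [
    {"part": "P-1001", "unit_category": "UNIT_A1"},
    {"part": "P-2002", "unit_category": "UNIT_D4"},
    {"part": "P-3003", "unit_category": "UNIT_A1"},
]
batch_results = score_in_batches(quotes)
realtime_results = [score_realtime(q) for q in quotes]
```

A parity check like this, comparing batch output with per-request output on a sample, is a cheap regression test to run whenever routing rules change.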
Pricing across a global supply chain requires multiple ML models—but applications shouldn't inherit that complexity. A Dynamic Model Router on Databricks routes many models behind a single endpoint, enabling a clean API for all apps, governed model management, consistent features, low-latency real-time pricing, and scalable batch processing. Databricks handles the infrastructure, letting teams focus on pricing intelligence—not on operating the system.