
Software engineering in Databricks

DBXDeveloper111
New Contributor

I'm a software engineer and a bit new to Databricks. My goal is to create a model serving endpoint that interfaces with several ML models. Traditionally, this would look like:

API --> Service --> Data

Now, using Databricks, my understanding is that it will look like:

Model Serving Endpoint --> Service Model --> ML Model

From a best-practices perspective, what is the best way to deploy? A single DAB (Databricks Asset Bundle) that bundles the resources to a single cluster? Multiple deployed models/clusters in more of a microservice fashion?

Also, is the service model even necessary?

I can see benefits to each method, and I'm certain there are aspects I'm overlooking. I'd love to hear how others are deploying.

 

1 REPLY

iyashk-DB
Databricks Employee

Hi @DBXDeveloper111,

A Model Serving endpoint is the “service”: it exposes a REST API and handles autoscaling on serverless compute. You don’t manage clusters for online inference. Each endpoint hosts one or more served entities (models/functions), which you reference and route to by name and version. You configure these in the endpoint’s served_entities section (via UI, REST, SDK, or MLflow Deployments). A separate “service model” is not required. Pre/post‑processing can live inside the model wrapper (MLflow pyfunc) or as a function/agent deployed to Model Serving if you need to orchestrate multiple backends.

  • Use multiple endpoints when models have different SLOs, hardware (CPU/GPU), scaling, or blast‑radius needs; endpoints are serverless and autoscale independently.
  • Use a single multi‑model endpoint for A/B or canary when models share similar runtimes; split traffic or hit a specific served model path (see the traffic‑split sketch after this list); you can't mix different model types in one endpoint.
  • Add an orchestrator only if a single API call must coordinate multiple models/tools; deploy a simple function/agent on Model Serving and keep the client contract stable.

You only need a separate service layer if you’re coordinating multiple models/tools or enforcing cross‑cutting policies that don’t fit neatly in one model’s code. In that case, deploy an orchestrator function/agent to Model Serving and keep the client contract stable.
