Authors: Sean Wilkinson, Ashwin Srikant, Cathy Zdravevski
Databricks Model Serving provides a scalable, low-latency hosting service for AI models. It supports models ranging from small custom models to best-in-class large language models (LLMs). In this blog we’ll describe the pricing model associated with Databricks Model Serving and demonstrate how to allocate costs per endpoint or per use case.
Databricks Model Serving now includes three distinct pricing methods. Regardless of the method you choose, the price is inclusive of all cloud infrastructure costs. The three different methods are covered briefly here:
The best way to track model serving costs in Databricks is through the billable usage system table. Once enabled, the table automatically populates with the latest usage in your Databricks account. No matter which of the three model serving methods you choose, your costs will appear in the system.billing.usage table with a sku_name value of either:
<tier>_SERVERLESS_REAL_TIME_INFERENCE_LAUNCH_<region>
which includes all DBUs accrued when an endpoint starts after scaling to zero. All other model serving costs are grouped under:
<tier>_SERVERLESS_REAL_TIME_INFERENCE_<region>
where tier corresponds to your Databricks platform tier and region corresponds to the cloud region of your Databricks deployment.
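To see how usage splits between the two SKU families, a query along the following lines can help. This is a minimal sketch, assuming the usage_quantity column of system.billing.usage holds the DBU amount; the cost_category labels are illustrative names, not Databricks terminology.

```sql
-- Sketch: total DBUs per model serving SKU, labelled by SKU family.
-- Note: "_" is a single-character wildcard in LIKE, so these patterns
-- are slightly more permissive than a literal substring match.
SELECT
  sku_name,
  CASE
    WHEN sku_name LIKE '%SERVERLESS_REAL_TIME_INFERENCE_LAUNCH%' THEN 'launch'
    ELSE 'serving'
  END AS cost_category,           -- illustrative label, not an official field
  SUM(usage_quantity) AS total_dbus
FROM system.billing.usage
WHERE sku_name LIKE '%SERVERLESS_REAL_TIME_INFERENCE%'
GROUP BY sku_name
ORDER BY total_dbus DESC;
```

A large share of DBUs in the launch category may indicate endpoints that scale to zero and restart frequently.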
You can easily query the system.billing.usage table to aggregate all DBUs (Databricks Units) associated with Databricks model serving. Here is an example query that aggregates model serving DBUs per day for the last 30 days:
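A minimal sketch of such a query, assuming the usage_date and usage_quantity columns of system.billing.usage and matching both SKU families described above:

```sql
-- Sketch: daily model serving DBUs for the last 30 days.
-- The LIKE pattern matches both the LAUNCH and the standard serving SKUs.
SELECT
  usage_date,
  SUM(usage_quantity) AS model_serving_dbus
FROM system.billing.usage
WHERE sku_name LIKE '%SERVERLESS_REAL_TIME_INFERENCE%'
  AND usage_date >= DATE_SUB(CURRENT_DATE(), 30)
GROUP BY usage_date
ORDER BY usage_date DESC;
```

The result is in DBUs rather than currency; multiplying by the DBU rate for your SKU gives an approximate cost.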
Aggregated costs may be sufficient for simple use cases, but as the number of endpoints grows it is desirable to break out costs based on use case, business unit, or other custom identifiers. Optional key/value tags can be applied to custom model endpoints. All custom tags applied to Databricks Model Serving endpoints propagate to the system.billing.usage table under the custom_tags column and can be used to aggregate and visualize costs. Databricks recommends adding descriptive tags to each endpoint for precise cost tracking.
Below is an example query that separates model serving costs by values of a specific tag for the Databricks account over the last 30 days.
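A minimal sketch of such a query follows. The tag key business_unit is a hypothetical example; substitute the key you applied to your endpoints. It assumes custom_tags is a string-to-string map and uses try_element_at so that rows without the tag yield NULL instead of an error.

```sql
-- Sketch: model serving DBUs per tag value for the last 30 days.
-- 'business_unit' is a hypothetical tag key used for illustration only.
SELECT
  COALESCE(try_element_at(custom_tags, 'business_unit'), 'untagged') AS tag_value,
  SUM(usage_quantity) AS model_serving_dbus
FROM system.billing.usage
WHERE sku_name LIKE '%SERVERLESS_REAL_TIME_INFERENCE%'
  AND usage_date >= DATE_SUB(CURRENT_DATE(), 30)
GROUP BY COALESCE(try_element_at(custom_tags, 'business_unit'), 'untagged')
ORDER BY model_serving_dbus DESC;
```

Grouping untagged usage under its own label makes it easy to spot endpoints that still need tags.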
Running the query in the Databricks SQL Editor breaks out model serving costs by tag value over the past month.
This is just the start of what you can view and visualize using the system.billing.usage table in Databricks! Stay tuned as Databricks plans to roll out additional tables and metrics within the system catalog.