Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Model Serving Latency Chart

Kaizen
Valued Contributor

Hi, 

For the model serving latency graph, what are p50 and p99? I only have one model being served on this endpoint, so I'm surprised to see two models being tracked.

 

Kaizen_0-1714504038212.png

 

2 REPLIES

Kaizen
Valued Contributor

If I'm not mistaken, this refers to 50% of responses and 99% of responses, with the metrics averaged accordingly?

 

@s_park 
@Sujitha 
@Debayan 

shan_chandra
Esteemed Contributor

@Kaizen - Please refer to the below explanation.

In a model latency chart, P50 and P99 represent the median and 99th percentile round-trip latency times, respectively.

- P50 (latency at the 50th percentile) is the median latency: 50% of the requests have a latency below this value and 50% have a latency above it.
- P99 (latency at the 99th percentile) is the value below which 99% of the observations fall. In other words, only 1% of the requests have a latency greater than this value.

These metrics are used to understand the distribution of latency and to identify outliers or abnormal behavior in system performance.
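To make the definitions concrete, here is a minimal sketch that computes both percentiles from a list of request latencies. It uses the nearest-rank method, which is an assumption on my part; the chart may use a different interpolation rule. The `percentile` helper and the sample latencies are hypothetical, not taken from the endpoint in question.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]

# Hypothetical latencies in milliseconds for ONE model:
# 98 fast requests (20-29 ms) plus two slow outliers.
latencies_ms = [20 + (i % 10) for i in range(98)] + [850, 900]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)

print(f"p50 = {p50} ms")  # typical request
print(f"p99 = {p99} ms")  # tail latency, dominated by the outliers
```

Both values come from the same stream of requests to a single model, so two lines on the chart would be two percentiles rather than two models. With this sample data the median stays near the fast requests while p99 jumps to the slow ones, which is why the gap between the two lines is useful for spotting outliers.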

Reference: https://docs.databricks.com/en/machine-learning/model-serving/metrics-export-serving-endpoint.html#s...
