Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Population stability index (PSI) calculation in Lakehouse monitor

Danik
New Contributor

Hi! We are using Lakehouse Monitoring to detect data drift in our metrics. However, the exact calculation of the metrics is not documented anywhere (I couldn't find it), which raises questions about how they are computed, in our case PSI especially.

I would like to ask the following questions (in descending order of priority):
1. Is there documentation anywhere on the implementation of PSI and the other metrics?

2. We have a case where, for two different metrics (F1 and recall, respectively), avg_delta and wasserstein_distance are both ~0.01, but PSI is 0.02 for one metric and ~2.2 for the other. I understand that this is possible because of the binning, but it would be much more insightful if we could see the algorithm and understand why it happens.

3. We have a test case where we compare two identical distributions/arrays. For two different metrics (F1 and recall, respectively), avg_delta is 2.0E-16 and 0. Wasserstein is 0 in both cases, and PSI is 0.041 in both cases. We can only assume that the non-zero values come from rounding error, but seeing the underlying algorithm would help us a lot.

Thanks in advance!

1 REPLY

iyashk-DB
Databricks Employee

Hi @Danik , I have reviewed this.

1) Is there documentation for PSI and other metrics?
Public docs list PSI in the drift table and give thresholds, but don’t detail the exact algorithm.
Internally, numeric PSI uses ~1000 quantiles, equal‑height binning on the baseline, plus a tiny smoothing epsilon for empty bins.
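
To make that description concrete, here is a minimal Python sketch of a PSI computed with equal-height bins taken from the baseline and a small epsilon for empty bins. It is not the Lakehouse Monitoring code, which is not public; the function name `psi`, the default `eps=1e-6`, and the handling of duplicate bin edges are assumptions made for illustration only.

```python
import numpy as np

def psi(baseline, current, n_bins=1000, eps=1e-6):
    """Illustrative PSI with equal-height (quantile) bins from the baseline.

    Assumed sketch only; not the Lakehouse Monitoring implementation.
    """
    baseline = np.asarray(baseline, dtype=float)
    current = np.asarray(current, dtype=float)

    # Bin edges are baseline quantiles; repeated edges (ties) are merged.
    edges = np.unique(np.quantile(baseline, np.linspace(0.0, 1.0, n_bins + 1)))
    # Keep interior edges only, so out-of-range values land in the end bins.
    inner = edges[1:-1]
    n = len(inner) + 1

    # Per-bin proportions for both samples.
    p = np.bincount(np.searchsorted(inner, baseline, side="right"), minlength=n) / baseline.size
    q = np.bincount(np.searchsorted(inner, current, side="right"), minlength=n) / current.size

    # Smoothing: floor empty bins at eps so the log and the ratio stay finite.
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)

    # PSI = sum over bins of (q - p) * ln(q / p).
    return float(np.sum((q - p) * np.log(q / p)))

rng = np.random.default_rng(0)
x = rng.normal(size=5000)
print(psi(x, x))        # same array on both sides
print(psi(x, x + 1.0))  # current shifted by one standard deviation
```

With an exact computation like this one, passing the same array on both sides gives identical bin proportions, so every term of the sum is zero.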

2) Why tiny avg_delta/Wasserstein (~0.01) but PSI differs (0.02 vs ~2.2)?
Different sensitivity: small shifts crossing baseline quantile bin edges can make bin proportions change a lot, so PSI jumps while mean/Wasserstein stay small.
Metrics like F1/recall can cluster near 0 or 1, so bin‑edge effects hit harder and inflate PSI.
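
A small simulation can show this effect. The sketch below applies the same shift of 0.01 to two synthetic metrics, one spread out around 0.5 and one tightly clustered around 0.95. The data, the 20-bin setting, and the helper names `psi` and `wasserstein_1d` are all hypothetical; this is not the monitor's code, only an illustration of how bin-based PSI reacts differently from mean and Wasserstein distance.

```python
import numpy as np

def psi(baseline, current, n_bins=20, eps=1e-6):
    # Equal-height bins from baseline quantiles (illustrative, not Databricks code).
    inner = np.unique(np.quantile(baseline, np.linspace(0.0, 1.0, n_bins + 1)))[1:-1]
    n = len(inner) + 1
    p = np.bincount(np.searchsorted(inner, baseline, side="right"), minlength=n) / len(baseline)
    q = np.bincount(np.searchsorted(inner, current, side="right"), minlength=n) / len(current)
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)
    return float(np.sum((q - p) * np.log(q / p)))

def wasserstein_1d(a, b):
    # 1-Wasserstein distance for two equal-size samples: mean gap between sorted values.
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

rng = np.random.default_rng(0)
shift = 0.01
baselines = {
    "spread": rng.normal(0.50, 0.100, 50_000),  # hypothetical wide metric
    "tight": rng.normal(0.95, 0.005, 50_000),   # hypothetical metric clustered near 1
}

results = {}
for name, base in baselines.items():
    cur = base + shift  # identical shift for both metrics
    results[name] = (
        float(cur.mean() - base.mean()),
        wasserstein_1d(base, cur),
        psi(base, cur),
    )
    print(name, results[name])
```

Both metrics move by the same 0.01 in mean and Wasserstein distance, but for the tight metric that shift is large relative to the width of the baseline bins, so most of the mass changes bins and PSI is far higher than for the spread metric.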

3) Why identical arrays give PSI ≈ 0.041 (but Wasserstein=0, avg_delta≈0)?
Quantile‑based approximation plus smoothing means PSI can come out slightly above 0 even when the other metrics read as 0.
Double‑check that both sides use the same windows and row counts; PSI is computed from per‑bin proportions, and the quantile arrays on the two sides may differ by a hair, which prevents an exact zero.
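
The following sketch illustrates that general effect under loudly stated assumptions. It stands in for an approximate quantile sketch by taking exact quantiles of a random subsample, and it computes PSI from the two quantile arrays alone. The helpers `approx_quantiles` and `psi_from_quantiles`, the 100 bins, and the subsample size are hypothetical and say nothing about how Lakehouse Monitoring actually approximates quantiles.

```python
import numpy as np

def approx_quantiles(values, n_bins, sample_size, seed):
    """Stand-in for an approximate quantile sketch: exact quantiles of a random subsample."""
    rng = np.random.default_rng(seed)
    sample = rng.choice(values, size=sample_size, replace=False)
    return np.quantile(sample, np.linspace(0.0, 1.0, n_bins + 1))

def psi_from_quantiles(base_q, cur_q, eps=1e-6):
    """PSI between two samples that are known only through their quantile arrays."""
    base_probs = np.linspace(0.0, 1.0, len(base_q))
    cur_probs = np.linspace(0.0, 1.0, len(cur_q))
    # Baseline mass per bin: equal-height bins hold the same share by construction.
    p = np.diff(base_probs)
    # Current CDF at the baseline edges, interpolated from the current quantiles.
    cdf = np.interp(base_q, cur_q, cur_probs)
    cdf[0], cdf[-1] = 0.0, 1.0  # treat the first and last bins as open-ended
    q = np.clip(np.diff(cdf), eps, None)  # smoothing for empty bins
    return float(np.sum((q - p) * np.log(q / p)))

data = np.random.default_rng(0).normal(size=100_000)

# Same data on both sides, but each side summarised by its own approximation.
summary_a = approx_quantiles(data, n_bins=100, sample_size=2_000, seed=1)
summary_b = approx_quantiles(data, n_bins=100, sample_size=2_000, seed=2)

psi_same_summary = psi_from_quantiles(summary_a, summary_a)
psi_two_summaries = psi_from_quantiles(summary_a, summary_b)
print(psi_same_summary, psi_two_summaries)
```

When both sides share one quantile array the result is zero up to floating-point error, whereas two independently approximated arrays over the same data give a small positive PSI. Mean and Wasserstein distance computed on the raw values would not be affected in the same way.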

Ref Docs - https://docs.databricks.com/aws/en/data-quality-monitoring/data-profiling/monitor-output
One good external doc about PSI - https://medium.com/model-monitoring-psi/population-stability-index-psi-ab133b0a5d42