
How to Reduce Log Latency for AI Gateway-Enabled Inference Tables in Model Serving?

ecram
New Contributor

Hi everyone,

I've recently deployed a custom model using Databricks Model Serving with AI Gateway-enabled inference tables. The model is built with:

  • Python 3.11.11

  • LightGBM 4.5.0

  • MLflow 2.13.1

I've noticed that the inference logs can take up to 1 hour to appear, which is consistent with the Databricks documentation. This is quite different from my previous setup (Python 3.10.12, LightGBM 3.3.5, MLflow 2.5.0), where logs appeared in roughly 5 minutes using legacy inference tables.

Question:
Is there any way to reduce the latency of inference logs when using AI Gateway-enabled inference tables?

I understand the system is now based on batch delivery, but I'd like to know:

  • Are there configuration options to speed this up?

  • Is there an official roadmap for reducing this latency?

  • Are there best practices for implementing near real-time logging (e.g., logging predictions manually into a Delta table within the model wrapper)? A rough sketch of what I have in mind is below.
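
Here is that sketch, purely illustrative: the table name, the environment variables, and the write-through-a-SQL-warehouse approach (since the serving container has no Spark session) are all my own assumptions, not anything Databricks provides for this.

import json
import os

import lightgbm as lgb
import mlflow.pyfunc
import pandas as pd
from databricks import sql  # databricks-sql-connector


class LoggingModelWrapper(mlflow.pyfunc.PythonModel):
    # Placeholder table; it would be created ahead of time, e.g.:
    # CREATE TABLE main.monitoring.manual_inference_log
    #     (logged_at TIMESTAMP, features STRING, predictions STRING)
    LOG_TABLE = "main.monitoring.manual_inference_log"

    def load_context(self, context):
        # Booster file logged as a model artifact under the key "lgb_model"
        self.booster = lgb.Booster(model_file=context.artifacts["lgb_model"])

    def predict(self, context, model_input: pd.DataFrame) -> pd.Series:
        preds = self.booster.predict(model_input)
        try:
            self._log(model_input, preds)
        except Exception:
            pass  # never fail the request because logging failed
        return pd.Series(preds)

    def _log(self, features: pd.DataFrame, preds) -> None:
        # Credentials would come from env vars set on the serving endpoint.
        with sql.connect(
            server_hostname=os.environ["DATABRICKS_HOST"],
            http_path=os.environ["WAREHOUSE_HTTP_PATH"],
            access_token=os.environ["DATABRICKS_TOKEN"],
        ) as conn, conn.cursor() as cursor:
            cursor.execute(
                f"INSERT INTO {self.LOG_TABLE} "
                "VALUES (current_timestamp(), :features, :preds)",
                {
                    "features": json.dumps(features.to_dict("records")),
                    "preds": json.dumps([float(p) for p in preds]),
                },
            )

Opening a connection per request would obviously add latency of its own, so in practice I'd buffer rows and flush them in the background. Mostly I want to know whether this direction is even recommended.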

Thanks in advance for your help!
Marcelo

1 REPLY

Kumaran
Databricks Employee

Hi @ecram,

Thank you for reaching out to the Databricks community.

As per the doc below, you can expect log delivery to the inference table to take up to 1 hour.

https://docs.databricks.com/aws/en/ai-gateway/inference-tables#:~:text=You%20can%20expect%20logs%20t....
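
In the meantime, if you want to watch how far behind log delivery is running, you can query the inference table directly from a notebook. A quick sketch (the table name is a placeholder; use the inference table configured on your endpoint, and note that request_time is the documented timestamp column):

# Run in a Databricks notebook, where `spark` is predefined.
from pyspark.sql import functions as F

# Placeholder name; use your endpoint's configured inference table.
logs = spark.table("main.default.my_endpoint_payload")

# Most recent delivered request, and how far it lags behind now.
logs.agg(
    F.max("request_time").alias("latest_delivered_request"),
    (
        F.unix_timestamp(F.current_timestamp())
        - F.unix_timestamp(F.max("request_time"))
    ).alias("lag_seconds"),
).show(truncate=False)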
