Hi everyone,
I've recently deployed a custom model using Databricks Model Serving with AI Gateway-enabled inference tables. The model is built with:
Python 3.11.11
LightGBM 4.5.0
MLflow 2.13.1
I’ve noticed that inference logs can take up to an hour to appear, which matches the Databricks documentation. This is quite different from my previous setup (Python 3.10.12, LightGBM 3.3.5, MLflow 2.5.0), where logs appeared in roughly 5 minutes using the legacy inference tables.
Question:
Is there any way to reduce the latency of inference logs when using AI Gateway-enabled inference tables?
I understand the new system is based on batch delivery, but I’d like to know:
Are there configuration options to speed this up?
Is there an official roadmap for reducing this latency?
Are there best practices for near-real-time logging (e.g., logging predictions manually into a Delta table inside the model wrapper)? I’ve sketched that idea below.
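To make the third question concrete, here is a minimal sketch of what I have in mind: a pyfunc wrapper around the LightGBM booster that appends each request/response batch to a Delta table via the deltalake (delta-rs) package. The class name, the log_table_uri parameter, and the "lgbm_model" artifact key are all my own placeholders, not anything from the Databricks docs, and this assumes the serving container has credentials for the storage location:

import datetime

import mlflow.pyfunc
import pandas as pd
from deltalake import write_deltalake  # delta-rs; must be in the serving env


class LoggingLGBMWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, log_table_uri):
        # Hypothetical location, e.g. "abfss://.../inference_log" or
        # "s3://.../inference_log"; the container needs write access to it.
        self.log_table_uri = log_table_uri

    def load_context(self, context):
        import lightgbm as lgb
        # Assumes the native LightGBM model file was logged as an
        # artifact under the key "lgbm_model".
        self.model = lgb.Booster(model_file=context.artifacts["lgbm_model"])

    def predict(self, context, model_input: pd.DataFrame, params=None):
        preds = self.model.predict(model_input)
        log_df = model_input.copy()
        log_df["prediction"] = preds
        log_df["logged_at"] = datetime.datetime.now(datetime.timezone.utc)
        try:
            # Synchronous append: adds latency to every request, so a
            # buffered/async write would be preferable in production.
            write_deltalake(self.log_table_uri, log_df, mode="append")
        except Exception:
            # Never fail the prediction because logging failed.
            pass
        return preds

Is something along these lines a reasonable workaround, or is there a better pattern (e.g., buffering writes, or emitting to a queue instead of writing Delta directly from the endpoint)?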
Thanks in advance for your help!
Marcelo