I'm building a custom UI table (Next.js frontend, FastAPI backend) to display MLflow trace data from a Retrieval-Augmented Generation (RAG) application running on Databricks Managed MLflow 3.0. The table needs to show answer generation speed (from CHAT_MODEL spans), query processing time (from RETRIEVER spans), user ID, prompt template ID, and user feedback (a good/bad rating plus comments, submitted asynchronously via an API). I plan to store this data in a Delta Table to support sorting, filtering, and pagination, since the default MLflow trace limit (100,000 per workspace) and the search_traces() rate limit (25 QPS) may not scale for my use case.
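For context, this is the rough Delta table schema I have in mind (table and column names are my own placeholders, nothing MLflow-defined):

```python
# Placeholder schema for the unified UI table (all names are mine).
spark.sql("""
    CREATE TABLE IF NOT EXISTS main.rag.traces_ui (
        trace_id            STRING,
        request_ts          TIMESTAMP,
        user_id             STRING,
        prompt_template_id  STRING,
        generation_ms       BIGINT,   -- duration of the CHAT_MODEL span
        retrieval_ms        BIGINT,   -- duration of the RETRIEVER span
        rating              STRING,   -- 'good' / 'bad', arrives asynchronously
        comment             STRING
    ) USING DELTA
""")
```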
I'm considering two approaches to sync MLflow traces and user feedback into the Delta Table:
1. Streaming with Delta Live Tables (DLT): Stream new traces and feedback into a Delta Table for near-real-time updates, given that feedback can arrive anytime after a query. Is DLT the best approach for this, and how can I efficiently ingest MLflow traces (stored in the managed backend) and feedback (stored in a separate Delta Table) into a unified Delta Table? Are there best practices for setting up the pipeline to handle high trace volumes and asynchronous feedback?
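To make option 1 concrete, here's a minimal DLT sketch of what I'm imagining. It assumes traces have already been landed in a Delta table I maintain (raw_traces, e.g. via a periodic export job, since I'm not aware of a way to stream directly out of the managed MLflow backend) and that feedback is written to raw_feedback; both names are placeholders. I've written the join as a full recompute rather than a stream-stream join, because late-arriving feedback would otherwise force watermark gymnastics:

```python
import dlt

@dlt.table(comment="Unified traces + feedback backing the UI table")
def traces_with_feedback():
    # Batch reads: DLT recomputes this table on each pipeline update,
    # which sidesteps stream-stream join watermarks for late feedback.
    traces = dlt.read("raw_traces")      # placeholder: traces exported from MLflow
    feedback = dlt.read("raw_feedback")  # placeholder: async user feedback
    return traces.join(feedback, on="trace_id", how="left")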
2. Scheduled Cron Job: Use a Databricks Workflow to periodically fetch new traces and feedback, merging them into the Delta Table. Would this be sufficient for a high-volume application, or will the search_traces() rate limit cause issues?
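For option 2, this is the merge pattern I'd expect the job to run on each trigger. The search_traces() filter grammar and the column names of the returned DataFrame are my assumptions and would need checking against the docs:

```python
import mlflow
from delta.tables import DeltaTable

def sync_new_traces(spark, experiment_id: str, last_ts_ms: int):
    # Fetch traces newer than the last sync watermark (pandas DataFrame).
    pdf = mlflow.search_traces(
        experiment_ids=[experiment_id],
        # Assumption: timestamp filtering works like this; verify the filter grammar.
        filter_string=f"attributes.timestamp_ms > {last_ts_ms}",
        order_by=["timestamp_ms ASC"],
    )
    if pdf.empty:
        return
    # In practice I'd project scalar columns first; span objects may not serialize.
    src = spark.createDataFrame(pdf)
    # Idempotent upsert keyed on trace_id, so re-running the job is safe.
    (DeltaTable.forName(spark, "main.rag.traces_ui")
        .alias("t")
        .merge(src.alias("s"), "t.trace_id = s.trace_id")
        .whenMatchedUpdateAll()
        .whenNotMatchedInsertAll()
        .execute())
```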
Key questions:
• Does MLflow or Databricks provide a native way to stream traces to a Delta Table, or do I need to export traces to an intermediate format (e.g., Parquet on DBFS)?
• How can I efficiently associate asynchronous user feedback (linked by trace_id) with traces in the Delta Table? (See the MERGE sketch after this list.)
• Are there performance considerations or best practices for querying the Delta Table with Spark SQL or Databricks SQL to power a responsive UI table with sorting, filtering, and pagination? (See the pagination sketch after this list.)
• If I approach the 100,000-trace limit, what's the process for requesting an increase, and how does it impact streaming or batch syncing?
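For the feedback question above, the pattern I have in mind is a second MERGE that only touches the feedback columns, so trace rows and feedback rows can arrive in either order (again, table and column names are placeholders):

```python
from delta.tables import DeltaTable

def upsert_feedback(spark, feedback_df):
    # feedback_df columns: trace_id, rating, comment - written by the feedback API.
    (DeltaTable.forName(spark, "main.rag.traces_ui")
        .alias("t")
        .merge(feedback_df.alias("f"), "t.trace_id = f.trace_id")
        .whenMatchedUpdate(set={"rating": "f.rating", "comment": "f.comment"})
        # If feedback can land before its trace, insert a stub row instead of dropping it.
        .whenNotMatchedInsert(values={
            "trace_id": "f.trace_id",
            "rating": "f.rating",
            "comment": "f.comment",
        })
        .execute())
```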
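And for the UI query itself, I'm leaning toward keyset pagination rather than OFFSET, roughly like this (f-strings only for readability; the real FastAPI endpoint would use bound parameters):

```python
def fetch_page(spark, last_ts: str, last_trace_id: str, page_size: int = 50):
    # Keyset pagination: seek past the last row of the previous page
    # instead of OFFSET, which rescans every skipped row.
    return spark.sql(f"""
        SELECT trace_id, request_ts, user_id, prompt_template_id,
               generation_ms, retrieval_ms, rating, comment
        FROM main.rag.traces_ui
        WHERE request_ts < '{last_ts}'
           OR (request_ts = '{last_ts}' AND trace_id < '{last_trace_id}')
        ORDER BY request_ts DESC, trace_id DESC
        LIMIT {page_size}
    """)
```

Does that hold up at high volume, or would clustering the table on request_ts (or some other layout) be the better lever here?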
Any example code, pipeline configurations, or recommendations for DLT vs. cron jobs would be greatly appreciated. Thanks!