Generative AI
Explore discussions on generative artificial intelligence techniques and applications within the Databricks Community. Share ideas, challenges, and breakthroughs in this cutting-edge field.

Tracing through model serving endpoint

srijan1881
New Contributor

 I have deployed code running on LangGraph through a model serving endpoint. I want to trace the calls using MLflow, and I want traces logged to the experiment whenever a user hits the serving endpoint. I have defined both of them in my code:

mlflow.set_experiment("/xxx")
mlflow.openai.autolog(
    disable=False,
)
 
and also set
"ENABLE_MLFLOW_TRACING": "true",
but I still can't see any traces in the experiment.
3 REPLIES

iyashk-DB
Databricks Employee

Hi @srijan1881 ,
What do you mean by logs here? If you mean tracing the step-by-step invocations on the model serving side, you need to add these environment variables to the served model (Serving > your endpoint > Edit endpoint > Environment variables), then restart the endpoint:

  • ENABLE_MLFLOW_TRACING=true
  • MLFLOW_EXPERIMENT_ID=<the numeric Experiment ID, not the path>
  • Auth for the endpoint to write to the experiment:
    Either DATABRICKS_HOST and DATABRICKS_TOKEN (PAT), or DATABRICKS_CLIENT_ID and DATABRICKS_CLIENT_SECRET (Service Principal). The identity must have CAN_EDIT on the target experiment.

Ref Doc - https://docs.databricks.com/aws/en/mlflow3/genai/tracing/prod-tracing

Also ensure that mlflow[databricks] is version 3.1 or later.
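As a rough illustration of the variables above, here is one way they might be set programmatically with the Databricks Python SDK instead of through the UI. This is a sketch under assumptions: the endpoint name, model name, experiment ID, and secret scope are placeholders, not values from this thread.

```python
# Sketch: the tracing-related environment variables described above, as a dict
# that can be attached to a served entity. All concrete values are placeholders.
tracing_env_vars = {
    "ENABLE_MLFLOW_TRACING": "true",
    # Use the numeric experiment ID, not the workspace path.
    "MLFLOW_EXPERIMENT_ID": "1234567890",
    # Auth so the endpoint can write traces to the experiment (PAT variant);
    # {{secrets/...}} is a Databricks secret reference, never a literal token.
    "DATABRICKS_HOST": "https://<your-workspace>.cloud.databricks.com",
    "DATABRICKS_TOKEN": "{{secrets/my_scope/my_token}}",
}

# With the databricks-sdk package, the dict would be applied per served entity
# when updating the endpoint config (run inside an authenticated environment):
#
# from databricks.sdk import WorkspaceClient
# from databricks.sdk.service.serving import ServedEntityInput
#
# w = WorkspaceClient()
# w.serving_endpoints.update_config(
#     name="my-langgraph-endpoint",
#     served_entities=[ServedEntityInput(
#         entity_name="catalog.schema.your_agent_model",
#         entity_version="1",
#         workload_size="Small",
#         scale_to_zero_enabled=True,
#         environment_vars=tracing_env_vars,
#     )],
# )

print(sorted(tracing_env_vars))
```

Remember that the endpoint must be restarted (a config update triggers this) before the new variables take effect.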

Hi @iyashk-DB, I referred to your other responses / clarifications in the community posts whilst looking for a solution.
Are you Yashwanth Kiran from Amrita Vishwa Vidyapeetham university?

SteveOstrowski
Databricks Employee

Hi @srijan1881,

The behavior you are seeing is expected when using a manually created model serving endpoint rather than one deployed through the Databricks Agent Framework. Here is a breakdown of why traces are not appearing and how to resolve it.

UNDERSTANDING THE ISSUE

When you deploy code to a model serving endpoint directly (not via the Agent Framework), setting mlflow.set_experiment() and mlflow.openai.autolog() in your model code does not automatically result in traces being written to that experiment. The model serving container environment has restrictions on which MLflow operations can write back to the tracking server, and the ENABLE_MLFLOW_TRACING environment variable alone is not sufficient to enable full experiment-level trace logging for custom-deployed endpoints.

RECOMMENDED APPROACH: USE THE AGENT FRAMEWORK

The supported way to get real-time MLflow tracing from a model serving endpoint is to deploy your LangGraph agent using the Databricks Agent Framework. This approach automatically configures tracing so that all interactions are logged to an MLflow experiment in real time.

1. Wrap your LangGraph agent as an MLflow model

Make sure your agent conforms to the MLflow ChatModel or ResponsesAgent interface. For a LangGraph-based agent, you can wrap it using mlflow.pyfunc.PythonModel or the newer mlflow.pyfunc.ChatModel / ResponsesAgent interfaces, then log it to Unity Catalog:

import mlflow

mlflow.set_registry_uri("databricks-uc")

with mlflow.start_run():
    model_info = mlflow.langchain.log_model(
        lc_model="/path/to/your/langgraph/agent",
        artifact_path="langgraph_agent",
        registered_model_name="catalog.schema.your_agent_model"
    )

2. Deploy using agents.deploy()

Install the required packages:

%pip install "mlflow>=3.1.3" "databricks-agents>=1.1.0"
dbutils.library.restartPython()

Then deploy:

import mlflow
from databricks import agents

mlflow.set_experiment("/Users/your_email/your_experiment_name")

deployment = agents.deploy(
    "catalog.schema.your_agent_model",
    model_version=1
)

3. Verify tracing

After deployment (which can take up to 15 minutes), send a request to the endpoint. Traces should appear in the MLflow experiment you specified via mlflow.set_experiment() before calling agents.deploy(). They will also be written to AI Gateway inference tables automatically for long-term retention.
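To generate a first trace for verification, you can send a test request to the endpoint. A minimal sketch, assuming a chat-style agent; the workspace URL, endpoint name, and token environment variable are placeholders:

```python
import json

# Sketch: build a chat-style invocation payload for the served agent.
payload = {"messages": [{"role": "user", "content": "What can you do?"}]}
body = json.dumps(payload)

# In a real client you would POST this body, e.g. with `requests`:
#
# import os, requests
# resp = requests.post(
#     "https://<workspace>/serving-endpoints/<endpoint-name>/invocations",
#     headers={
#         "Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}",
#         "Content-Type": "application/json",
#     },
#     data=body,
# )
#
# Once the call returns, the trace should show up in the configured experiment.

print(body)
```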

IMPORTANT NOTES

- Set the experiment before calling agents.deploy(), not inside your model code. The experiment must be set in the notebook or script that calls deploy().

- If you are deploying from a notebook inside a Databricks Git folder, real-time tracing will not work by default. You need to set the experiment to a path that is not associated with a Git folder before calling agents.deploy(). For example:

mlflow.set_experiment("/Users/your_email/tracing_experiment")

- All agents sharing the same endpoint will write traces to the same experiment.

- Traces are also written to inference tables automatically. You can find these in the Unity Catalog under the schema associated with your endpoint.
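For the inference-table side of the notes above, this is roughly the kind of query you might run in a Databricks notebook; the table name is a placeholder, since the real name depends on the catalog and schema configured on your endpoint:

```python
# Sketch: inspect the endpoint's payload (inference) table. Placeholder name.
table = "catalog.schema.my_endpoint_payload"
query = (
    f"SELECT timestamp_ms, request, response "
    f"FROM {table} "
    f"ORDER BY timestamp_ms DESC LIMIT 10"
)
# df = spark.sql(query)  # `spark` is available inside Databricks notebooks

print(query)
```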

IF YOU MUST USE A CUSTOM ENDPOINT (NOT AGENT FRAMEWORK)

If you need to keep your current custom model serving setup, traces from the serving container are captured in inference tables rather than in an MLflow experiment. You can:

1. Enable inference tables on your endpoint through the serving endpoint configuration UI or API.

2. Query the inference table to view request/response logs as a Delta table in Unity Catalog.

3. For full MLflow experiment-level tracing with custom endpoints, consider adding manual trace logging in your model's predict() method using the mlflow-tracing lightweight package and sending traces asynchronously. However, the Agent Framework path is the most straightforward and fully supported approach.
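As a hedged sketch of option 3, manual instrumentation with the mlflow.trace decorator might look like the following. The function body is a stub standing in for the LangGraph agent call, and the no-op fallback is only there so the sketch runs even where MLflow is absent:

```python
# Sketch: manual trace instrumentation for a custom endpoint's predict path,
# assuming the lightweight `mlflow-tracing` (or full mlflow>=3) package is
# installed in the serving environment.
try:
    import mlflow
    trace = getattr(mlflow, "trace", None)  # present in tracing-capable versions
except ImportError:
    trace = None

if trace is None:
    # No-op fallback so the sketch runs without MLflow installed.
    def trace(fn=None, **kwargs):
        if fn is None:
            return lambda f: f
        return fn

@trace(name="predict")
def predict(question: str) -> str:
    # In a real model this would invoke the LangGraph agent; stubbed here.
    return f"answer to: {question}"

print(predict("ping"))
```

With ENABLE_MLFLOW_TRACING and the experiment/auth variables set on the endpoint, each call to the decorated function is captured as a trace span.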

DOCUMENTATION REFERENCES

- MLflow Tracing overview: https://docs.databricks.com/aws/en/mlflow/mlflow-tracing.html
- Deploy agents with Agent Framework: https://docs.databricks.com/aws/en/generative-ai/agent-framework/deploy-agent.html
- Inference tables for model serving: https://docs.databricks.com/aws/en/machine-learning/model-serving/inference-tables.html
- Environment variables for model serving: https://docs.databricks.com/aws/en/machine-learning/model-serving/store-env-variable-model-serving.h...
- MLflow Tracing instrumentation: https://docs.databricks.com/aws/en/mlflow3/genai/tracing/app-instrumentation/

* This reply was drafted by an agent system I built, which researches responses using a wide set of documentation and previous memory. I personally review each draft for obvious issues and to monitor system reliability, and I update it when I detect drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand-new features.