Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

Why ENABLE_MLFLOW_TRACING does not work for serving endpoint?

d_szepietowska
New Contributor II

I would like to ask if you have experienced a similar issue recently.

I trained a sklearn model and logged it with fe.log_model for automatic feature lookup. Online feature tables were published with the currently recommended approach, Lakebase. I created a Serving Endpoint and configured an inference table with two additional settings:

ENABLE_FEATURE_TRACING = true
ENABLE_MLFLOW_TRACING = true

According to the documentation (Configure access to resources from model serving endpoints | Databricks on AWS), this should log the automatic feature lookup DataFrame to the inference table.

However, only the request and prediction are saved to the inference table.

Do you know any possible reason why this setup does not work?

3 REPLIES

Louis_Frolio
Databricks Employee

Hello @d_szepietowska , I did some research on my end and found a few helpful hints/tips to help you troubleshoot. 

Let’s walk through what should be happening, and then I’ll call out the most common reasons the feature lookup DataFrame doesn’t show up in the inference table, even when tracing looks like it’s enabled.

What should happen

When inference tables are enabled on the endpoint and you set the environment variable ENABLE_FEATURE_TRACING=true, the automatic feature lookup DataFrame should be logged to the endpoint’s inference table. This does require MLflow 2.14.0 or newer on the serving side.

For endpoints created starting in February 2025, the platform can also log the augmented DataFrame (that is, the looked-up features plus function return values) into the inference table when configured this way.

Common reasons it doesn’t show up

One of the most frequent causes is endpoint age. If the endpoint was created before the feature shipped (pre-February 2025), it won’t pick up augmented DataFrame logging. In that case, recreating the endpoint (or creating a new one) is required.

Another surprisingly common issue is casing. The value must be the lowercase string “true”, not “True” and not a boolean. Several folks have hit this exact issue, and switching to lowercase “true” immediately fixed it.

Placement matters as well. The environment variable has to be set on the served entity itself (served_entities[n].environment_vars). It won’t work if it’s applied only at a higher or different level of the endpoint config, so it’s worth double-checking it’s on the correct served entity.

There’s also some natural delay to account for. Inference table updates are best-effort and can take up to about an hour to appear. This is true for AI Gateway–enabled inference tables as well. If you checked right after scoring, the augmented DataFrame may simply not have landed yet.

Payload size limits can come into play too. Inference table logging has a 1 MiB cap for request, response, and traces. If the augmented DataFrame is large, it may be dropped or truncated, in which case it will show up as null and a logging error code will be set.
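As a rough sanity check, you can estimate whether a payload is anywhere near the 1 MiB cap before worrying about truncation. A minimal sketch (the payload shape below is illustrative, not your actual request):

```python
import json

# Hypothetical scoring payload; substitute your real request body.
payload = {"dataframe_records": [{"iris_id": 1}, {"iris_id": 51}, {"iris_id": 11}]}

def payload_size_bytes(obj) -> int:
    # Serialized JSON size in bytes, a rough proxy for what
    # inference-table logging has to store per request/response.
    return len(json.dumps(obj).encode("utf-8"))

LIMIT_BYTES = 1 * 1024 * 1024  # the 1 MiB logging cap mentioned above
size = payload_size_bytes(payload)
print(f"{size} bytes, under limit: {size < LIMIT_BYTES}")
```

The same check applied to your augmented DataFrame (serialized the same way) gives a ballpark for whether truncation could explain a missing entry.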

If you’re using legacy online tables, that can also block this. Databricks “online tables (legacy)” are deprecated. The current recommendation is Databricks Online Feature Store, which you’ll want to use instead of the legacy path.

Client and logging requirements are another checkpoint. The model needs to have been logged with FeatureEngineeringClient.log_model (or FeatureStoreClient.log_model for legacy setups), and the feature-engineering client version must be 0.3.5 or newer.

Finally, a quick note on ENABLE_MLFLOW_TRACING. That flag controls MLflow trace logging (primarily for GenAI and agent workflows) to MLflow experiments and/or inference tables. It does not, by itself, enable feature lookup DataFrame logging. That behavior is specifically controlled by ENABLE_FEATURE_TRACING. In short: ENABLE_MLFLOW_TRACING is optional and additive, not required for feature lookup logging.

Quick checks and suggested fixes

First, confirm the environment variable is applied exactly as ENABLE_FEATURE_TRACING="true" (lowercase string) on the served entity, then redeploy the endpoint configuration.
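If it helps, here is a small helper for that casing check. It only validates a plain dict, so the sample input is illustrative; in practice you would pass the environment_vars you fetch from your endpoint config:

```python
def check_tracing_vars(env_vars):
    # Return a list of problems: each tracing flag must be present and
    # equal to the exact lowercase string "true" (not "True", not a boolean).
    problems = []
    for key in ("ENABLE_FEATURE_TRACING", "ENABLE_MLFLOW_TRACING"):
        value = (env_vars or {}).get(key)
        if value is None:
            problems.append(f"{key} is not set")
        elif value != "true":
            problems.append(f"{key} is {value!r}; expected the lowercase string 'true'")
    return problems

# Illustrative: the capitalized value is the classic mistake.
print(check_tracing_vars({"ENABLE_FEATURE_TRACING": "True",
                          "ENABLE_MLFLOW_TRACING": "true"}))
```

In practice you would feed it `served_entity.environment_vars` from `WorkspaceClient().serving_endpoints.get(...)` for each served entity.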

If the endpoint predates February 2025, create a new endpoint (or fully recreate the existing one) with the same model and set ENABLE_FEATURE_TRACING="true", then test again.

Verify that inference tables are enabled on the endpoint and keep the log delivery window in mind (up to about an hour). If you have the option, AI Gateway–enabled inference tables are generally the better default going forward.

Lastly, make sure the model was logged with fe.log_model, that your feature-engineering client is version 0.3.5 or newer, and that your features are published to Databricks Online Feature Store rather than legacy online tables.
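The client-version check is easy to script. A minimal sketch, parsing simple dotted numeric versions only (pre-release suffixes are not handled, and the installed version string here is hard-coded as an example):

```python
def version_tuple(v):
    # Parse a simple dotted numeric version like "0.13.1" into a
    # comparable tuple of ints.
    return tuple(int(part) for part in v.split("."))

MINIMUM = version_tuple("0.3.5")          # minimum feature-engineering client
installed = version_tuple("0.13.1")       # substitute your installed version
print(installed >= MINIMUM)
```

In a notebook you could feed it `importlib.metadata.version("databricks-feature-engineering")` instead of the hard-coded string.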

Here’s the SDK pattern that ensures the environment variable is placed correctly on the served entity:

from databricks.sdk import WorkspaceClient
from databricks.sdk.service.serving import ServedEntityInput

# Names below are placeholders. update_config redeploys an existing
# endpoint; use serving_endpoints.create for a brand-new one.
w = WorkspaceClient()
w.serving_endpoints.update_config(
    name="your-endpoint",
    served_entities=[
        ServedEntityInput(
            name="your-model",
            entity_name="your.catalog.schema.model_name",
            entity_version="1",
            workload_size="Small",
            scale_to_zero_enabled=True,
            environment_vars={
                # Must be the exact lowercase string "true"
                "ENABLE_FEATURE_TRACING": "true"
            }
        )
    ]
)

 

 

Hope this helps, Louis.

Louis, thank you for your answer. I am afraid your suggestions did not help.

I prepared a case study to check where the root cause is. 

I decided to build a single feature store table based on the iris dataset and train a classification model logged with feature engineering. I ran this study in two scenarios:
1. Legacy online table for feature serving (MySQL)
2. Online Feature Store (Lakebase)
Then the two models were deployed to separate serving endpoints. Both endpoints were defined with the GUI, but I made sure to add the environment variables.

Both scenarios were executed in notebooks attached to the 17.3 ML Runtime. The only upgraded library was databricks-feature-engineering 0.13.01.

Results collected by inference tables differ:

Scenario 1 full response:  

{"predictions": [0, 1, 0], "databricks_output": {"trace": {"info": {"trace_id": "tr-691afb519f5cb2c9858a500e5e4d63be", "client_request_id": "6af2957e-9801-485d-879b-bdcd91051537", "trace_location": {"type": "MLFLOW_EXPERIMENT", "mlflow_experiment": {}}, "request_time": "2026-01-14T12:36:05.981Z", "state": "OK", "trace_metadata": {"mlflow.trace_schema.version": "3", "mlflow.databricks.modelServingEndpointName": "", "mlflow.modelId": "m-8012ad13733c4500b4511eebe7ae3ecc", "app_version_id": "models:/priv_dorota_szepietowska.tracing.iris_classifier/2", "is_truncated": false}, "request_preview": "{\"df\": \" iris_id\\n0 1\\n1 51\\n2 11\", \"params\": {\"result_type\": \"NO_RESULT_TYPE\"}, \"output_metrics\": null}", "response_preview": "\"[0 1 0]\"", "execution_duration_ms": 127}, "data": {"spans": [{"trace_id": "aRr7UZ9cssmFilAOXk1jvg==", "span_id": "ccUrWm9qA0I=", "trace_state": "", "parent_span_id": "", "name": "databricks_feature_store_1", "start_time_unix_nano": 1768394165981587318, "end_time_unix_nano": 1768394166108963499, "attributes": {"mlflow.spanFunctionName": "\"_predict\"", "mlflow.traceRequestId": "\"tr-691afb519f5cb2c9858a500e5e4d63be\"", "mlflow.spanType": "\"RETRIEVER\"", "mlflow.spanOutputs": "\"[0 1 0]\"", "mlflow.spanInputs": "{\"df\": \" iris_id\\n0 1\\n1 51\\n2 11\", \"params\": {\"result_type\": \"NO_RESULT_TYPE\"}, \"output_metrics\": null}"}, "status": {"message": "", "code": "STATUS_CODE_OK"}}, {"trace_id": "aRr7UZ9cssmFilAOXk1jvg==", "span_id": "2oV7ckrnZ+E=", "trace_state": "", "parent_span_id": "ccUrWm9qA0I=", "name": "databricks_feature_store_2", "start_time_unix_nano": 1768394165985419589, "end_time_unix_nano": 1768394166108542574, "attributes": {"mlflow.spanFunctionName": "\"_legacy_predict\"", "mlflow.traceRequestId": "\"tr-691afb519f5cb2c9858a500e5e4d63be\"", "mlflow.spanType": "\"RETRIEVER\"", "mlflow.spanOutputs": "\"[0 1 0]\"", "mlflow.spanInputs": "{\"df\": \" iris_id\\n0 1\\n1 51\\n2 11\", \"params\": {\"result_type\": \"NO_RESULT_TYPE\"}, \"output_metrics\": null}"}, "status": {"message": "", "code": "STATUS_CODE_OK"}}, {"trace_id": "aRr7UZ9cssmFilAOXk1jvg==", "span_id": "V6Iicbs8G+A=", "trace_state": "", "parent_span_id": "2oV7ckrnZ+E=", "name": "feature_lookup", "start_time_unix_nano": 1768394165990067826, "end_time_unix_nano": 1768394166097124278, "attributes": {"mlflow.spanFunctionName": "\"_monitored_augment_with_materialized_features\"", "mlflow.traceRequestId": "\"tr-691afb519f5cb2c9858a500e5e4d63be\"", "mlflow.spanType": "\"RETRIEVER\"", "mlflow.spanOutputs": "\" sepal_length sepal_width petal_length petal_width iris_id\\n0 4.9 3.0 1.4 0.2 1\\n1 6.4 3.2 4.5 1.5 51\\n2 4.8 3.4 1.6 0.2 11\"", "mlflow.spanInputs": "{\"df\": \" iris_id\\n0 1\\n1 51\\n2 11\", \"features_to_lookup\": [\"<FeatureColumnInfo: default_value_str=None, feature_name='petal_length', lookup_key=['iris_id'], output_name='petal_length', table_name='priv_dorota_szepietowska.tracing.iris_features', timestamp_lookup_key=[]>\", \"<FeatureColumnInfo: default_value_str=None, feature_name='petal_width', lookup_key=['iris_id'], output_name='petal_width', table_name='priv_dorota_szepietowska.tracing.iris_features', timestamp_lookup_key=[]>\", \"<FeatureColumnInfo: default_value_str=None, feature_name='sepal_length', lookup_key=['iris_id'], output_name='sepal_length', table_name='priv_dorota_szepietowska.tracing.iris_features', timestamp_lookup_key=[]>\", \"<FeatureColumnInfo: default_value_str=None, feature_name='sepal_width', lookup_key=['iris_id'], output_name='sepal_width', table_name='priv_dorota_szepietowska.tracing.iris_features', timestamp_lookup_key=[]>\"], \"partially_overridden_feature_output_names\": [], \"output_metrics\": null}"}, "status": {"message": "", "code": "STATUS_CODE_OK"}}, {"trace_id": "aRr7UZ9cssmFilAOXk1jvg==", "span_id": "RoUPPNFbTxs=", "trace_state": "", "parent_span_id": "V6Iicbs8G+A=", "name": "online_feature_store", "start_time_unix_nano": 1768394165995334193, "end_time_unix_nano": 1768394166089807496, "attributes": {"mlflow.spanFunctionName": "\"_send_req\"", "mlflow.traceRequestId": "\"tr-691afb519f5cb2c9858a500e5e4d63be\"", "mlflow.spanType": "\"RETRIEVER\"", "mlflow.spanOutputs": "{\"results\": {\"priv_dorota_szepietowska.tracing.iris_features_online\": {\"schema\": {\"columns\": [{\"name\": \"petal_width\", \"type_name\": \"DOUBLE\", \"nullable\": true}, {\"name\": \"sepal_length\", \"type_name\": \"DOUBLE\", \"nullable\": true}, {\"name\": \"petal_length\", \"type_name\": \"DOUBLE\", \"nullable\": true}, {\"name\": \"sepal_width\", \"type_name\": \"DOUBLE\", \"nullable\": true}, {\"name\": \"iris_id\", \"type_name\": \"LONG\", \"nullable\": false}]}, \"rows\": [[0.2, 4.9, 1.4, 3.0, \"1\"], [0.2, 4.8, 1.6, 3.4, \"11\"], [1.5, 6.4, 4.5, 3.2, \"51\"]]}}}", "mlflow.spanInputs": "{\"page_token\": null, \"is_retry\": false}"}, "status": {"message": "", "code": "STATUS_CODE_OK"}}, {"trace_id": "aRr7UZ9cssmFilAOXk1jvg==", "span_id": "cr4DwPsLcRs=", "trace_state": "", "parent_span_id": "2oV7ckrnZ+E=", "name": "custom_model", "start_time_unix_nano": 1768394166098174878, "end_time_unix_nano": 1768394166107880298, "attributes": {"mlflow.traceRequestId": "\"tr-691afb519f5cb2c9858a500e5e4d63be\"", "mlflow.spanType": "\"RETRIEVER\"", "mlflow.spanInputs": "{\"model_input\": [{\"sepal_length\": 4.9, \"sepal_width\": 3.0, \"petal_length\": 1.4, \"petal_width\": 0.2}, {\"sepal_length\": 6.4, \"sepal_width\": 3.2, \"petal_length\": 4.5, \"petal_width\": 1.5}, {\"sepal_length\": 4.8, \"sepal_width\": 3.4, \"petal_length\": 1.6, \"petal_width\": 0.2}]}"}, "status": {"message": "", "code": "STATUS_CODE_OK"}}]}}, "databricks_request_id": "6af2957e-9801-485d-879b-bdcd91051537"}}

 

Scenario 2 full response: {"predictions": [0, 1, 0], "databricks_output": {"trace": {"info": {"trace_id": "tr-5abe4d6637f0d4bcc08607be4d7048bc", "client_request_id": "7c81af0b-9532-41c6-86c1-d35f2efa2a9b", "trace_location": {"type": "MLFLOW_EXPERIMENT", "mlflow_experiment": {}}, "request_time": "2026-01-14T12:36:23.204Z", "state": "OK", "trace_metadata": {"mlflow.trace_schema.version": "3", "mlflow.databricks.modelServingEndpointName": "", "mlflow.modelId": "m-69ae367a222249738bd8561e457055c0", "app_version_id": "models:/priv_dorota_szepietowska.tracing_lakebase.iris_classifier/3", "is_truncated": false}, "request_preview": "{\"df\": \" iris_id\\n0 1\\n1 51\\n2 11\", \"params\": {\"result_type\": \"NO_RESULT_TYPE\"}, \"output_metrics\": null}", "response_preview": "\"[0 1 0]\"", "execution_duration_ms": 23}, "data": {"spans": [{"trace_id": "Wr5NZjfw1LzAhge+TXBIvA==", "span_id": "rclW4BIjkcM=", "trace_state": "", "parent_span_id": "", "name": "databricks_feature_store", "start_time_unix_nano": 1768394183204243150, "end_time_unix_nano": 1768394183228014574, "attributes": {"mlflow.spanType": "\"RETRIEVER\"", "mlflow.spanFunctionName": "\"_predict\"", "mlflow.traceRequestId": "\"tr-5abe4d6637f0d4bcc08607be4d7048bc\"", "mlflow.spanInputs": "{\"df\": \" iris_id\\n0 1\\n1 51\\n2 11\", \"params\": {\"result_type\": \"NO_RESULT_TYPE\"}, \"output_metrics\": null}", "mlflow.spanOutputs": "\"[0 1 0]\""}, "status": {"message": "", "code": "STATUS_CODE_OK"}}]}}, "databricks_request_id": "7c81af0b-9532-41c6-86c1-d35f2efa2a9b"}}

 

As you can see, in the second scenario there is no information about the feature values used for scoring.

These are examples of the saved responses with both variables set to true: ENABLE_MLFLOW_TRACING and ENABLE_FEATURE_TRACING. When I tested endpoints with only the ENABLE_FEATURE_TRACING variable set to true, the response did not include feature values for either endpoint and looked like this: "predictions": [0, 1, 2]

Let me know what your opinion is. Am I missing something?

 

SteveOstrowski
Databricks Employee

Hi @d_szepietowska,

Thank you for the detailed investigation, especially the side-by-side comparison between the legacy online table (MySQL) and the Lakebase-backed Online Feature Store. That is very helpful for narrowing down the behavior.

UNDERSTANDING THE TWO ENVIRONMENT VARIABLES

These two flags serve different purposes:

1. ENABLE_FEATURE_TRACING = "true" controls whether the automatic feature lookup augmented DataFrame (the looked-up feature values joined with your input) gets logged to the inference table. This is the one documented specifically for feature store tracing.

2. ENABLE_MLFLOW_TRACING = "true" controls MLflow trace logging, which captures the full span-level trace data (inputs, outputs, intermediate steps) and writes it to MLflow experiments and/or inference tables. This was originally designed for GenAI/agent observability, but it also captures feature store spans when the model uses automatic feature lookup.

Your test results confirm an important interaction: when only ENABLE_FEATURE_TRACING is set, neither scenario returned feature values in the response body. When both variables were set, the legacy online table path produced the full trace with feature lookup spans, but the Lakebase path did not include the detailed feature values in the trace spans.

WHAT YOUR RESULTS SHOW

In Scenario 1 (legacy online table), the trace includes a "feature_lookup" span with mlflow.spanOutputs containing the full augmented DataFrame with sepal_length, sepal_width, petal_length, petal_width values. The trace also includes an "online_feature_store" span showing the raw response from the online store.

In Scenario 2 (Lakebase), the trace only has a single "databricks_feature_store" span at the top level with no nested feature_lookup or online_feature_store child spans. The feature values are absent from the trace output.

This indicates that the Lakebase code path for automatic feature lookup does not yet emit the same granular trace spans as the legacy online table path. The underlying feature lookup still works (your predictions are correct, meaning features were looked up successfully), but the tracing instrumentation for the Lakebase path may not be fully producing the same span detail.

RECOMMENDED NEXT STEPS

1. Verify your databricks-feature-engineering library version. You mentioned 0.13.01, which should support tracing. Confirm it is the latest available version on your runtime, as newer patch releases may include improvements to Lakebase trace instrumentation.

2. Confirm the environment variable values are lowercase strings. Both must be exactly "true" (not "True" or a boolean). You can verify via the API:

GET /api/2.0/serving-endpoints/{endpoint_name}

Check that served_entities[].environment_vars shows exactly:
"ENABLE_FEATURE_TRACING": "true"
"ENABLE_MLFLOW_TRACING": "true"
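To make that check concrete, here is a hedged sketch that pulls those values out of the GET response body. The sample dict below only mirrors the response shape; in practice you would load the real JSON returned by the API call above:

```python
def tracing_env_vars(endpoint_json):
    # Map each served entity's name to its tracing-related env vars,
    # given a GET /api/2.0/serving-endpoints/{endpoint_name} response body.
    result = {}
    for entity in endpoint_json.get("config", {}).get("served_entities", []):
        env = entity.get("environment_vars") or {}
        result[entity.get("name")] = {
            k: env.get(k)
            for k in ("ENABLE_FEATURE_TRACING", "ENABLE_MLFLOW_TRACING")
        }
    return result

# Illustrative response fragment (entity name is a placeholder).
sample = {
    "config": {
        "served_entities": [
            {"name": "iris_classifier-3",
             "environment_vars": {"ENABLE_FEATURE_TRACING": "true",
                                  "ENABLE_MLFLOW_TRACING": "true"}}
        ]
    }
}
print(tracing_env_vars(sample))
```

Any entity mapping to a None or non-lowercase value is the one to fix and redeploy.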

3. Try recreating the endpoint. If it was created before February 2025, augmented DataFrame logging to inference tables requires a newly created endpoint. Even if you believe it is new, recreating it ensures you pick up the latest platform-side instrumentation.

4. Check the inference table directly (not just the response body). The augmented DataFrame may be logged to the inference table even when it does not appear in the synchronous response payload. Query your inference table and look for a column containing trace or feature data:

SELECT * FROM your_catalog.your_schema.your_inference_table
ORDER BY timestamp_ms DESC  -- classic inference tables use timestamp_ms; adjust if your table's timestamp column differs
LIMIT 10

5. Consider opening a support ticket. Your comparison is solid evidence that the Lakebase code path behaves differently from the legacy path regarding trace span generation. A support ticket with your two-scenario comparison would help the product team investigate whether this is a gap in the current Lakebase tracing instrumentation that will be addressed.

RELEVANT DOCUMENTATION

- Configure environment variables on serving endpoints:
https://docs.databricks.com/aws/en/machine-learning/model-serving/store-env-variable-model-serving

- Automatic feature lookup:
https://docs.databricks.com/aws/en/machine-learning/feature-store/automatic-feature-lookup

- Databricks Online Feature Store (Lakebase):
https://docs.databricks.com/aws/en/machine-learning/feature-store/online-tables

- MLflow Tracing:
https://docs.databricks.com/aws/en/mlflow/mlflow-tracing

* This reply used an agent system I built to research and draft this response based on the wide set of documentation I have available and previous memory. I personally review the draft for any obvious issues and for monitoring system reliability and update it when I detect any drift, but there is still a small chance that something is inaccurate, especially if you are experimenting with brand new features.

If this answer resolves your question, could you mark it as "Accept as Solution"? That helps other users quickly find the correct fix.