Is there a built-in way to measure latency metrics such as TTFT (time to first token) and TBT (time between tokens) when deploying agents on Databricks? I am using MLflow ChatAgent with ChatDatabricks / the OpenAI client (workspace client). If no built-in method exists, what would be the way to measure them?
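For anyone landing here with the same question: if nothing built-in fits, one option is to time the streamed response on the client side. Below is a minimal sketch, not an official Databricks or MLflow API. The helper name measure_stream_latency and the keys of the returned dict are made up for illustration. It assumes you already have some iterable of streamed chunks (for example, a streaming chat completion response), and it treats every chunk as one "token", which is only an approximation since a chunk may carry several tokens or none.

```python
import time
from typing import Callable, Iterable


def measure_stream_latency(
    chunks: Iterable,
    clock: Callable[[], float] = time.perf_counter,
) -> dict:
    """Consume a stream of chunks and return client-side latency numbers.

    Hypothetical helper: `chunks` is any iterable that yields streamed
    pieces of a response. The clock is injectable so it can be faked in tests.
    """
    start = clock()
    arrival_times = []

    # Record the arrival time of every chunk as it is yielded.
    for _ in chunks:
        arrival_times.append(clock())

    if not arrival_times:
        # Nothing was streamed, so TTFT and TBT are undefined.
        return {
            "ttft_s": None,
            "mean_tbt_s": None,
            "total_s": clock() - start,
            "num_chunks": 0,
        }

    # Gaps between consecutive chunks approximate time between tokens.
    gaps = [later - earlier for earlier, later in zip(arrival_times, arrival_times[1:])]

    return {
        "ttft_s": arrival_times[0] - start,
        "mean_tbt_s": sum(gaps) / len(gaps) if gaps else None,
        "total_s": arrival_times[-1] - start,
        "num_chunks": len(arrival_times),
    }


def fake_stream(n: int = 3, delay_s: float = 0.01):
    """Stand-in for a real streaming response, used only for the demo."""
    for i in range(n):
        time.sleep(delay_s)
        yield f"chunk-{i}"


if __name__ == "__main__":
    print(measure_stream_latency(fake_stream()))
```

To use it against a real endpoint, pass the streaming response iterator in place of fake_stream(). Note that the start time is taken when the helper is called, so with a lazily evaluated stream the request should be issued at that point; otherwise, capture the start time yourself just before sending the request.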
The code works fine locally, but when deployed to a serving endpoint it gives me the error below at runtime: {"error_code": "BAD_REQUEST", "message": "Encountered an unexpected error while evaluating the model. Verify that the input is compatible with the model for in...
Thanks for your response, Louis. If I understand correctly, for production monitoring we would have to rely on client-side logging. Can mlflow.log_metric be integrated with traces by any chance? (Since that seems to be the only way to measure TTFT...