Resolved! How to get MLflow OpenAI autolog traces from PySpark mapInPandas workers (and some pitfalls)
Context: I'm running an LLM pipeline on Databricks that distributes OpenAI API calls across Spark workers via mapInPandas. Getting mlflow.openai.autolog() to work on workers required solving three undocumented issues. Sharing here since I couldn't find...
- 146 Views
- 1 reply
- 1 kudos
Latest Reply
Greetings @Jayachithra, I did some digging and came up with some tips to help you along. On Issue 1 (explicit MLflow context): this is expected behavior once you realize that mapInPandas spawns isolated Python worker processes, not threads. ...
- 1 kudos
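For anyone landing here from search, here is a minimal sketch of what "explicit MLflow context" on the worker could look like. It assumes that the MLflow setup done on the driver is not inherited by the separate Python processes that run mapInPandas, so the function repeats it itself. The helper name `make_traced_mapper`, the tracking URI, and the experiment path are hypothetical and not taken from the original post; this is an illustration, not the author's exact fix.

```python
from typing import Callable, Iterator


def make_traced_mapper(tracking_uri: str, experiment_name: str) -> Callable:
    """Build a function suitable for DataFrame.mapInPandas (hypothetical helper)."""

    def mapper(batches: Iterator) -> Iterator:
        # Imported here so the import happens inside the worker process.
        import mlflow

        # Assumption: driver-side MLflow configuration is not visible in this
        # process, so the tracking URI, experiment, and autologging are set again.
        mlflow.set_tracking_uri(tracking_uri)
        mlflow.set_experiment(experiment_name)
        mlflow.openai.autolog()

        for pdf in batches:
            # Placeholder: the per-row OpenAI calls would go here.
            yield pdf

    return mapper
```

Usage would be along the lines of `df.mapInPandas(make_traced_mapper("databricks", "/Shared/my-experiment"), schema=df.schema)`, with both arguments replaced by your own values. Passing the settings in through a closure keeps the worker function free of any dependency on driver-side global state.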