Unable to add dependencies to mlflow.langchain.log_model
08-07-2024 03:32 PM
Hello,
I'm doing this:

```python
with mlflow.start_run(run_name="run1"):
    logged_chain_info = mlflow.langchain.log_model(
        # lc_model=os.path.join(os.getcwd(), 'full_chain')  # this doesn't work either
        lc_model='/Workspace/Users/{user_name}/exp/deploy_chain.py',
        model_config="/Workspace/Users/{user_name}/exp/chain_config.yaml",
        artifact_path="exp_1_artifact",
        input_example=input_example,
        example_no_conversion=True,
        code_paths=["/Workspace/Users/{user_name}/exp/example_docs.py"]
    )
```

But when I do `import example_docs` in deploy_chain.py, it says module not found when I run the above code.
Similarly, if I try to add a PDF/image file in code_paths and access it using a relative path, it fails at the mlflow run step; if I use an absolute path instead, it fails while serving the endpoint (file not found).
How should I add dependent files to this?
11-10-2025 12:25 PM
Hi @yashshingvi,
Thanks for the details—this is a common gotcha with MLflow “models from code.”
Why your imports fail
- `code_paths` entries are only added to `sys.path` when the model is *loaded* (for inference/serving), not while the driver is executing `mlflow.langchain.log_model(...)` to log the model.
- In the code-based logging flow, MLflow runs your `lc_model` file (deploy_chain.py) during logging; any imports inside that file must already be importable in the current notebook/cluster environment at logging time.
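To see that mechanism in isolation, here is a minimal, MLflow-free sketch (the directory and module contents are made up for illustration): importing a helper module fails until its directory is on `sys.path`, which is essentially what MLflow does with your `code_paths` entries at load time, and what you must do yourself at logging time.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway directory with a helper module, standing in for
# /Workspace/Users/<user>/exp/example_docs.py
helpers_dir = Path(tempfile.mkdtemp())
(helpers_dir / "example_docs.py").write_text("DOCS = ['doc1', 'doc2']\n")

# Before the directory is on sys.path, the import fails...
try:
    import example_docs  # noqa: F401
    found_before = True
except ModuleNotFoundError:
    found_before = False

# ...after appending it, the same import succeeds.
sys.path.append(str(helpers_dir))
example_docs = importlib.import_module("example_docs")

print(found_before)       # False
print(example_docs.DOCS)  # ['doc1', 'doc2']
```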
Make Python modules importable at logging time
Pick one of these:
- Add the directory containing your helper modules (example_docs.py) to `sys.path` before calling `log_model`:

  ```python
  import sys
  sys.path.append("/Workspace/Users/<user>/exp")  # folder that contains example_docs.py

  import mlflow

  with mlflow.start_run(run_name="run1"):
      logged_chain_info = mlflow.langchain.log_model(
          lc_model="/Workspace/Users/<user>/exp/deploy_chain.py",
          model_config="/Workspace/Users/<user>/exp/chain_config.yaml",
          artifact_path="exp_1_artifact",
          input_example=input_example,
          example_no_conversion=True,
          # include the whole directory so it's available at load/serve time too
          code_paths=["/Workspace/Users/<user>/exp"],
      )
  ```

  This ensures `import example_docs` inside deploy_chain.py resolves during the logging step, and the directory is also packaged for serving.
- Preferably, use a Databricks Repo and install your package in the logging environment:

  ```
  # One-time per cluster, or in your notebook before log_model
  %pip install -e /Workspace/Repos/<your_repo>/  # repo with a pyproject.toml/setup.py
  # then log as usual
  ```

  This is the most reliable way to satisfy imports at both logging and serving time, since MLflow will capture and restore the package dependencies.
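For the PDF/image part of your question, one pattern that avoids both failure modes is to resolve the data file relative to the module's own location (`__file__`) rather than using a fixed absolute path or the current working directory. Because `code_paths` copies the whole directory alongside your model code, a `__file__`-relative path resolves correctly both in the workspace and inside the serving container. A minimal sketch, with illustrative file names, that simulates importing the copied module from elsewhere:

```python
import importlib
import os
import sys
import tempfile
from pathlib import Path

# Stand-in for /Workspace/Users/<user>/exp: a directory holding both the
# chain code and a data file that must travel with it via code_paths.
exp_dir = Path(tempfile.mkdtemp())
(exp_dir / "example.pdf").write_bytes(b"%PDF-1.4 fake")
(exp_dir / "deploy_chain.py").write_text(
    "import os\n"
    "# Resolve the data file relative to this module, not the CWD,\n"
    "# so the path works wherever the directory gets copied.\n"
    "PDF_PATH = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'example.pdf')\n"
)

# Simulate the serving environment importing the copied module while
# running from a different working directory.
sys.path.append(str(exp_dir))
deploy_chain = importlib.import_module("deploy_chain")

print(os.path.exists(deploy_chain.PDF_PATH))  # True
```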