stbjelcevic
Databricks Employee

Hi @yashshingvi ,

Thanks for the details. This is a common gotcha with MLflow "models from code."

Why your imports fail

  • Directories listed in code_paths are only added to sys.path when the model is loaded (for inference/serving), not while the driver is executing mlflow.langchain.log_model(...).
  • In the code-based logging flow, MLflow executes your lc_model file (deploy_chain.py) during logging, so any imports inside that file must already be resolvable in the current notebook/cluster environment at logging time.

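A self-contained sketch (standing in for the MLflow internals, which it does not call) of why this fails: a module that is not on sys.path at logging time cannot be imported, even though the same directory would be packaged via code_paths for serving. The temp directory here plays the role of your /Workspace folder.

```python
import sys
import tempfile
from pathlib import Path

# Create a stand-in for the folder holding example_docs.py.
workdir = Path(tempfile.mkdtemp())
(workdir / "example_docs.py").write_text("DOCS = ['doc1', 'doc2']\n")

# Before the directory is on sys.path, the import fails -- this is what
# happens inside deploy_chain.py during log_model if the folder was only
# passed via code_paths.
try:
    import example_docs  # noqa: F401
    found_before = True
except ImportError:
    found_before = False

# Mirror of sys.path.append("/Workspace/Users/<user>/exp").
sys.path.append(str(workdir))
import example_docs  # now resolvable

print(found_before, example_docs.DOCS)
```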
Make Python modules importable at logging time

Pick one of these:

  • Add the directory containing your helper modules (example_docs.py) to sys.path before calling log_model:

    import sys
    sys.path.append("/Workspace/Users/<user>/exp")  # folder that contains example_docs.py
    
    import mlflow
    
    with mlflow.start_run(run_name="run1"):
        logged_chain_info = mlflow.langchain.log_model(
            lc_model="/Workspace/Users/<user>/exp/deploy_chain.py",
            model_config="/Workspace/Users/<user>/exp/chain_config.yaml",
            artifact_path="exp_1_artifact",
            input_example=input_example,
            example_no_conversion=True,
            # include the whole directory so it’s available at load/serve time too
            code_paths=["/Workspace/Users/<user>/exp"],
        )

    This ensures import example_docs inside deploy_chain.py resolves during the logging step, and the directory is also packaged for serving.
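As a quick pre-flight (a hypothetical helper, not part of MLflow), you can verify that the directory you are about to pass as code_paths actually contains the modules deploy_chain.py imports, before spending time on log_model. The module names and temp directory below are illustrative.

```python
from pathlib import Path
import tempfile

def check_code_paths(code_dir: str, required_modules: list[str]) -> list[str]:
    """Return the module names that are missing from code_dir."""
    root = Path(code_dir)
    return [m for m in required_modules if not (root / f"{m}.py").exists()]

# Temp directory standing in for /Workspace/Users/<user>/exp.
d = tempfile.mkdtemp()
Path(d, "example_docs.py").write_text("DOCS = []\n")

missing = check_code_paths(d, ["example_docs", "other_helper"])
print(missing)  # -> ['other_helper']
```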

  • Preferably, use a Databricks Repo and install your package in the logging environment:

    # One-time per cluster, or in your notebook before log_model
    %pip install -e /Workspace/Repos/<your_repo>/  # repo root with pyproject.toml or setup.py
    dbutils.library.restartPython()  # restart so the newly installed package is importable
    
    # then log as usual

    This is the most reliable way to satisfy imports both at logging and serving time, as MLflow will capture and restore package dependencies.
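A small sanity check (a hypothetical snippet, not MLflow API) before logging: confirm the package you installed with %pip install -e is actually importable in the current Python process. "my_chain_pkg" is a placeholder for your package's name.

```python
import importlib.util

def is_importable(package_name: str) -> bool:
    """True if the package can be found on the current sys.path."""
    return importlib.util.find_spec(package_name) is not None

print(is_importable("json"))          # stdlib module, always present: True
print(is_importable("my_chain_pkg"))  # placeholder: False until installed
```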
