Louis_Frolio
Databricks Employee

Greetings @hawa, thanks for sharing the details. This looks like a combination of registration and configuration issues that commonly surface with the MLflow LangChain flavor on Databricks.

What’s going wrong

  • The registered model name should be a full three-level Unity Catalog path like <catalog>.<schema>.<model>. Using just "model1" causes registration/serving mismatches and can lead to “not successfully registered” errors when serving from UC.
  • The LangChain flavor needs chain type info in the logged model’s config so it can reconstruct the chain at load/serve time. Without it, you get “Must specify a chain Type in config.” The fix is to pass model_config={"chain_type": "stuff"} (or whichever chain type you used) when calling mlflow.langchain.log_model(...), so the logged MLflow artifact records the chain type needed at serving time.
  • It’s best to validate the model before serving by loading the model back and invoking it (or using mlflow.models.predict) to ensure the runtime and signature behave as expected.
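
The first point can be checked before anything is logged. Here is a minimal sketch, assuming simple unquoted identifiers. The helper name is_three_level_uc_name and its character rule are illustrative assumptions, not part of the MLflow or Databricks APIs, and Unity Catalog's real naming rules are broader than this.

```python
import re

# Hypothetical helper: checks only the <catalog>.<schema>.<model> shape.
# The allowed-character rule below is a simplifying assumption.
_PART = re.compile(r"^[A-Za-z0-9_-]+$")


def is_three_level_uc_name(name: str) -> bool:
    """Return True if `name` looks like catalog.schema.model."""
    parts = name.split(".")
    return len(parts) == 3 and all(_PART.match(p) for p in parts)


print(is_three_level_uc_name("model1"))               # False
print(is_three_level_uc_name("prod.ai_apps.model1"))  # True
```

Running a check like this before log_model surfaces a bare name such as "model1" early, rather than at registration or serving time.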

Fix: log and register correctly, then validate

Below is a minimal pattern that addresses all three points.

from mlflow.models import infer_signature
import mlflow
import langchain

# 1) Use a full UC name
CATALOG = "prod"
SCHEMA = "ai_apps"
MODEL_BASENAME = "model1"
REGISTERED_MODEL_NAME = f"{CATALOG}.{SCHEMA}.{MODEL_BASENAME}"

mlflow.set_registry_uri("databricks-uc")

# Assume you already built `chain` (with your chain_type="stuff") and have a loader_fn (e.g., get_retriver)
question = {"query": "Hello"}  # keep your input schema consistent with how the chain expects inputs
answer = chain.invoke(question)
signature = infer_signature(question, answer)

with mlflow.start_run(run_name="clippy_rag") as run:
    model_info = mlflow.langchain.log_model(
        chain,
        loader_fn=get_retriver,                     # your retriever factory
        artifact_path="chain",
        registered_model_name=REGISTERED_MODEL_NAME,
        # 2) Persist chain type so serving can reconstruct it
        model_config={"chain_type": "stuff"},
        # Pin requirements needed at serve time
        pip_requirements=[
            f"mlflow=={mlflow.__version__}",
            f"langchain=={langchain.__version__}",
            "databricks-vectorsearch",
        ],
        # 3) Keep non-DataFrame example intact for proper signature inference
        input_example=question,
        example_no_conversion=True,
        signature=signature,
    )

# Optional: quick pre-deployment validation
loaded = mlflow.langchain.load_model(model_info.model_uri)
_ = loaded.invoke(question)  # should run without errors

Why this works

  • The full Unity Catalog path ensures the version is created under UC and can be targeted by Model Serving without cross-registry confusion.
  • Providing model_config={"chain_type": "stuff"} writes the chain type into the MLflow LangChain flavor’s config (the steps YAML), which satisfies LangChain’s loader; without it, the loader throws “Must specify a chain Type in config.”
  • Doing a quick self-load/invoke avoids surprises at serving time and aligns with Databricks’ guidance to validate models pre-deployment.

Then serve it

You can now create a custom model serving endpoint from the UI (Serving > Create endpoint), selecting your UC model by its full name and version. The endpoint should transition to READY once the container image is built and the model is loaded.
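
If you would rather script this than use the UI, the endpoint can also be created through the Databricks serving-endpoints REST API (POST /api/2.0/serving-endpoints). The sketch below only builds the request body and does not send it. The endpoint name, model version, and workload settings are placeholder assumptions, and the field names should be checked against the current API reference for your workspace.

```python
import json


def build_endpoint_request(endpoint_name: str, uc_model_name: str, version: str) -> dict:
    """Build a request body for creating a serving endpoint (nothing is sent here)."""
    return {
        "name": endpoint_name,
        "config": {
            "served_entities": [
                {
                    "entity_name": uc_model_name,   # full three-level UC name
                    "entity_version": version,      # registered model version
                    "workload_size": "Small",       # assumption: smallest size
                    "scale_to_zero_enabled": True,  # assumption: save cost when idle
                }
            ]
        },
    }


# Placeholder values; substitute your own endpoint name and model version.
body = build_endpoint_request("clippy-rag-endpoint", "prod.ai_apps.model1", "1")
print(json.dumps(body, indent=2))
```

Sending the body requires a workspace URL and token, which are left out here on purpose.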
 

Extra tips

  • If your endpoint shows “Not Ready” for an extended period, confirm the model version status in UC (READY vs. PENDING) and that the endpoint creator’s identity has UC access to the catalog/schema/model. If permissions are wrong for the creator, delete and recreate under a principal with correct UC privileges.
  • When logging nonstandard dependencies (private wheels or pinned versions), prefer logging them with the model (via pip_requirements, extra_pip_requirements, or conda_env) to ensure the serving container matches your training env.
  • If you want Databricks-managed authentication to resources (Vector Search, foundation model endpoints), consider the resources mechanism described in the agent logging docs; for simple retrievers your loader_fn is fine, but resources help with auth passthrough in production.
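
For the dependency-pinning tip, one option is to build the pip_requirements list from whatever is installed in the notebook environment. This is a minimal sketch using only the standard library. The helper name pinned_requirements is hypothetical, and falling back to an unpinned name when a package is not installed locally is an assumption you may want to turn into an error instead.

```python
from importlib import metadata
from typing import Callable, Iterable, List


def pinned_requirements(
    packages: Iterable[str],
    version_of: Callable[[str], str] = metadata.version,
) -> List[str]:
    """Return 'name==version' for installed packages, bare names otherwise."""
    pins = []
    for name in packages:
        try:
            pins.append(f"{name}=={version_of(name)}")
        except metadata.PackageNotFoundError:
            pins.append(name)  # not installed locally: leave unpinned
    return pins


# Output depends on what is installed in the current environment.
print(pinned_requirements(["mlflow", "langchain", "databricks-vectorsearch"]))
```

The resulting list can be passed as pip_requirements=... to mlflow.langchain.log_model in place of the hand-written f-strings.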
 
Cheers, Louis.