When does everyone utilize the model register?

Yuki — Fri, 18 Apr 2025 05:37:47 GMT

Hi, I'm Yuki,

I'm trying to work out when I should use register_model.

In my case, I'm running the training batch once a week and if the model is good, I want to update the champion.

I have created the code to register the model if the score is the best.

import mlflow
from mlflow import MlflowClient
from sklearn.ensemble import RandomForestRegressor

# start run
with mlflow.start_run():
    clf = RandomForestRegressor(n_estimators=100)
    clf.fit(X, y)
    # log model to the run (the path must match the runs:/ URI used below)
    mlflow.sklearn.log_model(sk_model=clf, artifact_path="model")

# search best model of all runs
best_run = mlflow.search_runs([experiment_id], order_by=["metric.test_f1 DESC"]).iloc[0]

# register model
result = mlflow.register_model(f"runs:/{best_run.run_id}/model", model_name)

# create "Champion" alias for the best version (model_name is e.g. "prod.ml_team.iris_model")
client = MlflowClient()
client.set_registered_model_alias(model_name, "Champion", result.version)

This avoids registering many models in the model registry, which keeps it clean.

But I feel that the code is not perfect and seems strange.

First, in the documentation, the model can be registered on every run simply by passing registered_model_name, as shown below.

mlflow.sklearn.log_model(
    sk_model=clf,
    artifact_path="model",
    # The signature is automatically inferred from the input example and its predicted output.
    input_example=input_example,
    registered_model_name="prod.ml_team.iris_model",
)

In which cases is this approach the right one to use?

Second, my code may create duplicate model versions from the same source run. Of course, I can check for duplicates before registering the model, but I suspect I'm missing something. If all I want is the best model, I could just call mlflow.search_runs([experiment_id], order_by=["metric.test_f1 DESC"]).iloc[0] every time, with no need for registration.

I don't grasp the core idea of the model registry.

How does everyone else handle this?

Re: When does everyone utilize the model register?

Kumaran — Thu, 21 Aug 2025 19:52:39 GMT

Hi @Yuki,

Thank you for contacting the Databricks community.

If you run register_model with the same run twice, you’ll create multiple versions pointing to the same source.
To avoid that, you can check if the run is already registered before creating a new version:

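import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()
# Compare run IDs rather than sources: a version's source is typically stored
# as the resolved artifact location, not as the runs:/ URI passed to register_model.
registered_run_ids = {v.run_id for v in client.search_model_versions(f"name='{model_name}'")}
if best_run.run_id not in registered_run_ids:
    result = mlflow.register_model(f"runs:/{best_run.run_id}/model", model_name)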
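
The duplicate check and the alias update from the original post could be combined into one helper, so the weekly batch can be rerun safely. The following is only a sketch under some assumptions: `promote_run` is a hypothetical name, the model is assumed to have been logged under the artifact path "model", and the client and the register function are passed in as arguments (in practice `MlflowClient()` and `mlflow.register_model`) so the logic can be tried without a tracking server.

```python
def promote_run(client, register, model_name, run_id, alias="Champion"):
    """Register the run's model only if needed, then point the alias at it.

    client   -- object providing search_model_versions() and
                set_registered_model_alias(), e.g. mlflow.MlflowClient()
    register -- callable taking (model_uri, name), e.g. mlflow.register_model
    """
    versions = client.search_model_versions(f"name='{model_name}'")
    # Reuse the existing version if this run has already been registered.
    version = next((v for v in versions if v.run_id == run_id), None)
    if version is None:
        # Assumption: the model was logged under the artifact path "model".
        version = register(f"runs:/{run_id}/model", model_name)
    # Moving the alias is safe to repeat; it simply points at the given version.
    client.set_registered_model_alias(model_name, alias, version.version)
    return version.version


# Hypothetical usage with real MLflow objects:
# import mlflow
# promote_run(mlflow.MlflowClient(), mlflow.register_model, model_name, best_run.run_id)
```

With this shape, running the job twice against the same best run leaves a single version in the registry, and the "Champion" alias still ends up on it.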
