12-19-2024 09:01 AM
I want to get the LightGBM built-in variable importance values from a model that was generated by AutoML. That's not logged in the metrics by default - can I change a setting so that it will be logged?
More fundamentally: what I'd really like is to modify the LightGBM notebook generated by AutoML so that it logs the model to MLflow with flavor.loader_module equal to mlflow.lightgbm. By default, it logs with that parameter equal to mlflow.sklearn.
As a result, if I want to load that model back in for new predictions, I have to use model = mlflow.pyfunc.load_model(), and that loads the model in a generic format that doesn't include useful LightGBM things like model.feature_importances.
I want to be able to load the model back in via model = mlflow.lightgbm.load_model(), but currently that generates an error because the model wasn't saved in the right format for that.
Now, I know model = mlflow.lightgbm.load_model() succeeds on a different model that I originally saved in LightGBM flavor via mlflow.lightgbm.log_model(). But the AutoML notebook doesn't use load_model(), so I have to look further for a way to force LightGBM flavor.
In that vein, I did find a command pyfunc.add_to_model(mlflow_model, loader_module="mlflow.sklearn") in the notebook. Sadly, changing it to loader_module="mlflow.lightgbm" had no discernible effect on the problem: the model saved to MLflow still had flavor.loader_module equal to mlflow.sklearn.
12-31-2024 05:31 AM
To address your concerns about logging LightGBM feature importance and modifying the AutoML-generated LightGBM model to use the mlflow.lightgbm flavor, you'll need to make some changes to the AutoML notebook. Here's an approach to achieve what you're looking for:
Logging Feature Importance
LightGBM's feature importance is not logged by default in MLflow's autologging. To log this information, you can manually add it to the MLflow run after the model is trained. Here's how you can do this:
import mlflow
import lightgbm as lgb

# Assuming 'model' is your trained LightGBM Booster. If AutoML gave you
# the sklearn API (LGBMClassifier/LGBMRegressor), use model.booster_ here.
feature_importance = model.feature_importance(importance_type='gain')
feature_names = model.feature_name()

# Log each feature's importance as its own metric
for feature, importance in zip(feature_names, feature_importance):
    mlflow.log_metric(f"feature_importance_{feature}", importance)
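One metric per feature can clutter the metrics table on wide datasets. A minimal alternative sketch, assuming an active MLflow run and a lightgbm.Booster, logs the whole importance table as a single JSON artifact via mlflow.log_dict (the artifact filename is just an example):

```python
def importance_table(feature_names, importances):
    """Pair feature names with importance values, sorted descending.

    Pure helper with no MLflow dependency, so it is reusable elsewhere.
    """
    return sorted(zip(feature_names, importances),
                  key=lambda nv: nv[1], reverse=True)


def log_importance_artifact(model, artifact_file="feature_importance.json"):
    """Log a LightGBM Booster's gain importance as one JSON artifact.

    Assumes an active MLflow run; 'model' is a lightgbm.Booster.
    """
    import mlflow  # deferred so importance_table stays dependency-free

    table = importance_table(model.feature_name(),
                             model.feature_importance(importance_type="gain"))
    mlflow.log_dict({name: float(score) for name, score in table},
                    artifact_file)
```

Logging an artifact keeps the run's metrics view clean and preserves the full table for later download.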
Changing Model Flavor to LightGBM
To log the model with the LightGBM flavor instead of scikit-learn, you need to modify the model-logging cell in the AutoML notebook. AutoML wraps the LightGBM estimator in a scikit-learn Pipeline, which is why the whole model is logged with the sklearn flavor; extract the LightGBM step and log it explicitly:

import mlflow.lightgbm

# The step name 'lightgbm' may differ in your generated notebook;
# inspect pipeline.named_steps to find the right key.
lightgbm_model = pipeline.named_steps['lightgbm']
mlflow.lightgbm.log_model(lightgbm_model, "lightgbm_model", registered_model_name="your_model_name")

Note that this logs only the estimator, not the pipeline's preprocessing steps, so new data must be preprocessed the same way before calling predict.
Loading the Model
After making these changes, you should be able to load the model using:
loaded_model = mlflow.lightgbm.load_model("runs:/your_run_id/lightgbm_model")
This loaded model will have access to LightGBM-specific attributes like feature_importances_ (sklearn API) or the feature_importance() method (Booster API), depending on which object was logged.
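To confirm the flavor change took effect, you can inspect the MLmodel file that MLflow writes next to the model artifacts. The sketch below uses a simple line scan (so no YAML parser is needed) to pull the python_function loader_module out of MLmodel text; the file path in the usage comment is a placeholder:

```python
def pyfunc_loader_module(mlmodel_text):
    """Extract the python_function loader_module from MLmodel YAML text.

    Assumes the standard MLmodel layout MLflow writes: a 'flavors'
    mapping containing a 'python_function' block.
    """
    in_pyfunc = False
    for line in mlmodel_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("python_function:"):
            in_pyfunc = True
        elif in_pyfunc and stripped.startswith("loader_module:"):
            return stripped.split(":", 1)[1].strip()
    return None


# Usage (path is a placeholder for your downloaded model artifacts):
# with open("model/MLmodel") as f:
#     print(pyfunc_loader_module(f.read()))
```

If the result is still mlflow.sklearn, the logging cell you edited is not the one that actually produced the registered model.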
12-31-2024 05:31 AM
The pyfunc.add_to_model() function you mentioned adds the Python Function (pyfunc) flavor to the model, which is separate from the model's primary flavor. Changing its loader_module parameter therefore doesn't change how the model itself was logged, which is why it had no effect on the issue.
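Concretely, an MLmodel file carries the pyfunc flavor and the native flavor as separate entries. The sketch below uses a hand-built flavors dict (the keys mirror what MLflow writes, but the dict here is constructed purely for illustration) to show why editing loader_module alone cannot help:

```python
# Hand-built illustration of the 'flavors' section of an MLmodel file.
# pyfunc.add_to_model() writes only the 'python_function' entry; the
# native flavor key ('sklearn' here) is written by mlflow.sklearn.log_model
# and is untouched by add_to_model - hence the observation in the question.
flavors = {
    "python_function": {
        "loader_module": "mlflow.sklearn",  # what add_to_model controls
        "env": "conda.yaml",
    },
    "sklearn": {  # the primary flavor; only <flavor>.log_model sets this
        "pickled_model": "model.pkl",
    },
}


def native_flavor(flavors):
    """Return the non-pyfunc (native) flavor name, if any."""
    names = [k for k in flavors if k != "python_function"]
    return names[0] if names else None
```

mlflow.lightgbm.load_model() needs the native flavor entry to be "lightgbm", so the only fix is to re-log the extracted estimator with mlflow.lightgbm.log_model, as shown above.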