<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Save model from AutoML to MLflow in LightGBM flavor in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/103685#M3880</link>
    <description>&lt;H2 class="mb-2 mt-6 text-lg first:mt-3"&gt;Additional Considerations&lt;/H2&gt;
&lt;UL class="marker:text-textOff list-disc pl-8"&gt;
&lt;LI&gt;&lt;SPAN&gt;The&amp;nbsp;&lt;CODE&gt;pyfunc.add_to_model()&lt;/CODE&gt;&amp;nbsp;function you mentioned is used to add the Python Function flavor to the model, which is different from changing the primary flavor of the logged model. That's why changing its parameter didn't solve the issue.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;If you need to maintain compatibility with the existing AutoML pipeline, you might consider logging the model twice: once with the scikit-learn flavor for the pipeline, and once with the LightGBM flavor for accessing LightGBM-specific features.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Remember to test these changes thoroughly, as they may affect how the model is used in production environments that expect the scikit-learn flavor.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
    <pubDate>Tue, 31 Dec 2024 13:31:55 GMT</pubDate>
    <dc:creator>Alberto_Umana</dc:creator>
    <dc:date>2024-12-31T13:31:55Z</dc:date>
    <item>
      <title>Save model from AutoML to MLflow in LightGBM flavor</title>
      <link>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/102701#M3870</link>
      <description>&lt;P&gt;I want to get the LightGBM built-in variable importance values from a model that was generated by AutoML.&amp;nbsp; That's not logged in the metrics by default - can I change a setting so that it will be logged?&lt;/P&gt;&lt;P&gt;More fundamentally:&amp;nbsp; what I'd really like is to modify the LightGBM notebook generated by AutoML so that it logs the model to MLflow with &lt;FONT face="courier new,courier" color="#808080"&gt;flavor.loader_module&lt;/FONT&gt;&amp;nbsp;equal to &lt;FONT face="courier new,courier" color="#808080"&gt;mlflow.lightgbm&lt;/FONT&gt;.&amp;nbsp; By default, it logs with that parameter equal to &lt;FONT face="courier new,courier" color="#808080"&gt;mlflow.sklearn&lt;/FONT&gt;.&lt;/P&gt;&lt;P&gt;As a result, if I want to load that model back in for new predictions, I have to use &lt;FONT face="courier new,courier" color="#808080"&gt;model =&amp;nbsp;&lt;SPAN&gt;mlflow.pyfunc.&lt;/SPAN&gt;&lt;SPAN&gt;load_model&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="courier new,courier" color="#808080"&gt;()&lt;/FONT&gt;, and that loads the model in a generic format that doesn't include useful LightGBM things like &lt;FONT face="courier new,courier" color="#999999"&gt;model.feature_importances&lt;/FONT&gt;.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;I want to be able to load the model back in via &lt;FONT face="courier new,courier" color="#999999"&gt;model =&amp;nbsp;&lt;SPAN&gt;mlflow.lightgbm.&lt;/SPAN&gt;&lt;SPAN&gt;load_model&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="courier new,courier" color="#999999"&gt;()&lt;/FONT&gt;, but currently that generates an error because the model wasn't saved in the right format for that.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Now, I know &lt;FONT face="courier new,courier" color="#999999"&gt;model =&amp;nbsp;&lt;SPAN&gt;mlflow.lightgbm.&lt;/SPAN&gt;&lt;SPAN&gt;load_model&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="courier new,courier" 
color="#999999"&gt;()&lt;/FONT&gt;&amp;nbsp;succeeds on a different model that I originally saved in LightGBM flavor via &lt;/SPAN&gt;&lt;FONT face="courier new,courier" color="#999999"&gt;&lt;SPAN&gt;mlflow.lightgbm.&lt;/SPAN&gt;&lt;SPAN&gt;log_model&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="courier new,courier" color="#999999"&gt;()&lt;/FONT&gt;.&amp;nbsp; But the AutoML notebook doesn't use &lt;FONT face="courier new,courier" color="#999999"&gt;load_model()&lt;/FONT&gt;, so I have to look further for a way to force LightGBM flavor.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;In that vein, I did find a command &lt;FONT face="courier new,courier" color="#999999"&gt;&lt;SPAN&gt;pyfunc.&lt;/SPAN&gt;&lt;SPAN&gt;add_to_model&lt;/SPAN&gt;&lt;SPAN&gt;(mlflow_model, &lt;/SPAN&gt;&lt;SPAN&gt;loader_module&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;SPAN&gt;"mlflow.sklearn"&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="courier new,courier" color="#999999"&gt;)&lt;/FONT&gt;&amp;nbsp;in the notebook.&amp;nbsp; Sadly, changing it to &lt;FONT face="courier new,courier" color="#999999"&gt;&lt;SPAN&gt;loader_module&lt;/SPAN&gt;&lt;SPAN&gt;=&lt;/SPAN&gt;&lt;/FONT&gt;&lt;SPAN&gt;&lt;FONT face="courier new,courier" color="#999999"&gt;"mlflow.lightgbm"&lt;/FONT&gt;&amp;nbsp;had no discernable effect on the problem.&amp;nbsp; The model saved to MLflow still had &lt;FONT face="courier new,courier" color="#999999"&gt;flavor.loader_module&lt;/FONT&gt;&amp;nbsp;equal to &lt;FONT face="courier new,courier" color="#999999"&gt;mlflow.sklearn&lt;/FONT&gt;.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 19 Dec 2024 17:01:19 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/102701#M3870</guid>
      <dc:creator>dkxxx-rc</dc:creator>
      <dc:date>2024-12-19T17:01:19Z</dc:date>
    </item>
    <item>
      <title>Re: Save model from AutoML to MLflow in LightGBM flavor</title>
      <link>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/103684#M3879</link>
      <description>&lt;P class="p1"&gt;To address your concerns about logging LightGBM feature importance and modifying the AutoML-generated LightGBM model to use the&amp;nbsp;&lt;STRONG&gt;mlflow.lightgbm&lt;/STRONG&gt;&amp;nbsp;flavor, you'll need to make some changes to the AutoML notebook. Here's an approach to achieve what you're looking for:&lt;/P&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Logging Feature Importance&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="p1"&gt;The AutoML-generated notebook does not log LightGBM's feature importance by default. To record it, add it manually to the MLflow run after the model is trained:&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;import mlflow&lt;/P&gt;
&lt;P class="p1"&gt;import lightgbm as lgb&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;# Assuming 'model' is a trained lightgbm.Booster; for an LGBMClassifier or LGBMRegressor, use model.booster_ instead&lt;/P&gt;
&lt;P class="p1"&gt;feature_importance = model.feature_importance(importance_type='gain')&lt;/P&gt;
&lt;P class="p1"&gt;feature_names = model.feature_name()&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;# Log feature importance&lt;/P&gt;
&lt;P class="p1"&gt;for feature, importance in zip(feature_names, feature_importance):&lt;/P&gt;
&lt;P class="p1"&gt;&lt;SPAN class="Apple-converted-space"&gt;&amp;nbsp; &amp;nbsp; &lt;/SPAN&gt;mlflow.log_metric(f"feature_importance_{feature}", importance)&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
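&lt;P class="p1"&gt;One thing to watch with the loop above: MLflow rejects metric keys that contain unusual characters, and real feature names often include them. The following is a minimal sketch, not part of the AutoML notebook, that builds a dictionary of safe metric keys. The helper name&amp;nbsp;&lt;STRONG&gt;importance_metrics&lt;/STRONG&gt;, the conservative set of allowed characters, and the sample feature names are all assumptions made for illustration.&lt;/P&gt;

```python
import re

# Conservative assumption: keep only alphanumerics, underscore, dash,
# period, space, and slash in a metric key; replace anything else with "_".
_INVALID_KEY_CHARS = re.compile(r"[^A-Za-z0-9_\-. /]")


def importance_metrics(feature_names, importances, prefix="feature_importance_"):
    """Build a {metric_key: float} dict suitable for mlflow.log_metrics()."""
    metrics = {}
    for name, value in zip(feature_names, importances):
        key = prefix + _INVALID_KEY_CHARS.sub("_", str(name))
        # float() turns NumPy scalars into plain Python floats.
        metrics[key] = float(value)
    return metrics


# Hypothetical feature names and importance values, for illustration only.
example = importance_metrics(["age", "income ($)", "zip:code"], [12.5, 3, 0.0])
print(example)
```

&lt;P class="p1"&gt;With a trained Booster and an active run, something like&amp;nbsp;&lt;STRONG&gt;mlflow.log_metrics(importance_metrics(model.feature_name(), model.feature_importance(importance_type='gain')))&lt;/STRONG&gt;&amp;nbsp;should log every value in a single call. Note that two feature names can collapse to the same key after sanitizing, in which case the later one wins.&lt;/P&gt;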
&lt;P class="p3"&gt;&lt;STRONG&gt;Changing Model Flavor to LightGBM&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="p1"&gt;To log the model with the LightGBM flavor instead of scikit-learn, you need to modify the model logging process in the AutoML notebook. Here's how you can do it:&lt;/P&gt;
&lt;OL class="ol1"&gt;
&lt;LI class="li1"&gt;Find the part of the notebook where the model is being logged to MLflow.&lt;/LI&gt;
&lt;LI class="li1"&gt;Replace the existing logging code with&amp;nbsp;&lt;STRONG&gt;mlflow.lightgbm.log_model()&lt;/STRONG&gt;. Here's an example:&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;import mlflow.lightgbm&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;# Assuming 'model' is your trained LightGBM model&lt;/P&gt;
&lt;P class="p1"&gt;mlflow.lightgbm.log_model(model, "model", registered_model_name="your_model_name")&lt;/P&gt;
&lt;P class="p4"&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL class="ol1" start="3"&gt;
&lt;LI class="li1"&gt;If you're using a pipeline that includes preprocessing steps, you'll need to log the LightGBM model separately from the pipeline. You can do this by extracting the LightGBM model from the pipeline:&lt;/LI&gt;
&lt;/OL&gt;
&lt;P class="p5"&gt;&amp;nbsp;&lt;/P&gt;
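&lt;P class="p1"&gt;# 'lightgbm' is a placeholder step name; inspect pipeline.named_steps to find the actual one&lt;/P&gt;
&lt;P class="p1"&gt;lightgbm_model = pipeline.named_steps['lightgbm']&lt;/P&gt;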
&lt;P class="p1"&gt;mlflow.lightgbm.log_model(lightgbm_model, "lightgbm_model")&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL class="ol1" start="4"&gt;
&lt;LI class="li1"&gt;You may also need to log the preprocessing steps separately if they're required for making predictions.&lt;/LI&gt;
&lt;/OL&gt;
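&lt;P class="p1"&gt;The step name used when extracting the model from the pipeline depends on how the notebook built it, so hard-coding it may fail. As a rough sketch under that assumption, the hypothetical helper&amp;nbsp;&lt;STRONG&gt;find_lightgbm_step&lt;/STRONG&gt;&amp;nbsp;below scans a mapping of step names to estimators and returns the first one whose class is defined in the lightgbm package. The stand-in classes exist only so the example runs without LightGBM installed.&lt;/P&gt;

```python
def find_lightgbm_step(named_steps):
    """Return (step_name, estimator) for the first step defined in the lightgbm package."""
    for name, step in named_steps.items():
        module = type(step).__module__ or ""
        if module == "lightgbm" or module.startswith("lightgbm."):
            return name, step
    raise LookupError("no LightGBM estimator found among: " + ", ".join(named_steps))


# Stand-ins for illustration only; a real pipeline would hold fitted estimators.
class FakeScaler:
    pass


class FakeLGBMClassifier:
    pass


# Mimic a class that is defined inside the lightgbm package.
FakeLGBMClassifier.__module__ = "lightgbm.sklearn"

steps = {"preprocessor": FakeScaler(), "classifier": FakeLGBMClassifier()}
step_name, estimator = find_lightgbm_step(steps)
print(step_name)
```

&lt;P class="p1"&gt;In the notebook, the call would be&amp;nbsp;&lt;STRONG&gt;find_lightgbm_step(pipeline.named_steps)&lt;/STRONG&gt;, and the returned estimator is what you would pass to&amp;nbsp;&lt;STRONG&gt;mlflow.lightgbm.log_model()&lt;/STRONG&gt;. Whether that function accepts the scikit-learn-style wrapper directly depends on your MLflow version; if it does not, the wrapper's&amp;nbsp;&lt;STRONG&gt;booster_&lt;/STRONG&gt;&amp;nbsp;attribute gives you the underlying Booster.&lt;/P&gt;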
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;&lt;STRONG&gt;Loading the Model&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="p1"&gt;After making these changes, you should be able to load the model using:&lt;/P&gt;
&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1"&gt;loaded_model = mlflow.lightgbm.load_model("runs:/your_run_id/lightgbm_model")&lt;/P&gt;
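&lt;P class="p1"&gt;What you get back depends on what was logged: a scikit-learn-style estimator such as LGBMClassifier exposes&amp;nbsp;&lt;STRONG&gt;feature_importances_&lt;/STRONG&gt;, while a native Booster provides the&amp;nbsp;&lt;STRONG&gt;feature_importance()&lt;/STRONG&gt;&amp;nbsp;method instead.&lt;/P&gt;</description>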
      <pubDate>Tue, 31 Dec 2024 13:31:29 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/103684#M3879</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-12-31T13:31:29Z</dc:date>
    </item>
    <item>
      <title>Re: Save model from AutoML to MLflow in LightGBM flavor</title>
      <link>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/103685#M3880</link>
      <description>&lt;H2 class="mb-2 mt-6 text-lg first:mt-3"&gt;Additional Considerations&lt;/H2&gt;
&lt;UL class="marker:text-textOff list-disc pl-8"&gt;
&lt;LI&gt;&lt;SPAN&gt;The&amp;nbsp;&lt;CODE&gt;pyfunc.add_to_model()&lt;/CODE&gt;&amp;nbsp;function you mentioned is used to add the Python Function flavor to the model, which is different from changing the primary flavor of the logged model. That's why changing its parameter didn't solve the issue.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;If you need to maintain compatibility with the existing AutoML pipeline, you might consider logging the model twice: once with the scikit-learn flavor for the pipeline, and once with the LightGBM flavor for accessing LightGBM-specific features.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;LI&gt;&lt;SPAN&gt;Remember to test these changes thoroughly, as they may affect how the model is used in production environments that expect the scikit-learn flavor.&lt;/SPAN&gt;&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 31 Dec 2024 13:31:55 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/save-model-from-automl-to-mlflow-in-lightgbm-flavor/m-p/103685#M3880</guid>
      <dc:creator>Alberto_Umana</dc:creator>
      <dc:date>2024-12-31T13:31:55Z</dc:date>
    </item>
  </channel>
</rss>

