<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Models failing in tutorial in Machine Learning</title>
    <link>https://community.databricks.com/t5/machine-learning/models-failing-in-tutorial/m-p/158830#M4625</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am following the &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/getting-started/ml-get-started" target="_blank" rel="noopener"&gt;"Get started: Build your first machine learning model on Databricks" tutorial&lt;/A&gt; and am stuck on the "Parallel training using Optuna" section.&lt;/P&gt;&lt;P&gt;When I reach &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/getting-started/ml-get-started#search-runs-to-retrieve-the-best-model" target="_self"&gt;Search runs to retrieve the best model&lt;/A&gt;, the following code fails because no models are logged against the runs:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;best_model_pyfunc = mlflow.pyfunc.load_model(
  'runs:/{run_id}/model'.format(
    run_id=best_run.run_id
  )
)&lt;/LI-CODE&gt;&lt;P&gt;When I open the runs in &lt;EM&gt;Experiments&lt;/EM&gt;, the model listed against each run shows "Failed", and I cannot find any further detail about why.&lt;/P&gt;&lt;P&gt;Why is the following code (taken directly from the full, unaltered notebook provided by the tutorial) not logging a model against each run?&lt;/P&gt;&lt;P&gt;And how or where can I find the cause? I tried adding logging inside the objective, but nothing appeared in the cell output. I know the code is running because each run logs test_auc, just not the model.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;def objective(trial):
  # Enable autologging on each worker
  mlflow.sklearn.autolog()
  with mlflow.start_run(nested=True):
    params = {
      'n_estimators': trial.suggest_int('n_estimators', 20, 1000),
      'learning_rate': trial.suggest_float('learning_rate', 0.05, 1.0, log=True),
      'max_depth': trial.suggest_int('max_depth', 2, 5),
    }
    model_hp = sklearn.ensemble.GradientBoostingClassifier(
      random_state=0,
      **params
    )
    model_hp.fit(X_train, y_train)
    predicted_probs = model_hp.predict_proba(X_test)
    # Tune based on the test AUC
    # In production, you could use a separate validation set instead
    roc_auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs[:,1])
    mlflow.log_metric('test_auc', roc_auc)

    # Negate the AUC because Optuna minimizes the objective by default
    return -roc_auc


with mlflow.start_run(run_name='gb_optuna') as run:
  # Use the MLflow Tracking Server as the Optuna storage backend
  experiment_id = mlflow.active_run().info.experiment_id
  mlflow_storage = MlflowStorage(experiment_id=experiment_id)

  # MlflowSparkStudy distributes the tuning using Spark workers
  mlflow_study = MlflowSparkStudy(
    study_name="gb-optuna-tuning",
    storage=mlflow_storage,
  )

  mlflow_study.optimize(objective, n_trials=32, n_jobs=4)&lt;/LI-CODE&gt;</description>
    <pubDate>Thu, 11 Jun 2026 20:19:45 GMT</pubDate>
    <dc:creator>appliable_ai</dc:creator>
    <dc:date>2026-06-11T20:19:45Z</dc:date>
    <item>
      <title>Models failing in tutorial</title>
      <link>https://community.databricks.com/t5/machine-learning/models-failing-in-tutorial/m-p/158830#M4625</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I am following the &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/getting-started/ml-get-started" target="_blank" rel="noopener"&gt;"Get started: Build your first machine learning model on Databricks" tutorial&lt;/A&gt; and am stuck on the "Parallel training using Optuna" section.&lt;/P&gt;&lt;P&gt;When I reach &lt;A href="https://learn.microsoft.com/en-us/azure/databricks/getting-started/ml-get-started#search-runs-to-retrieve-the-best-model" target="_self"&gt;Search runs to retrieve the best model&lt;/A&gt;, the following code fails because no models are logged against the runs:&lt;/P&gt;&lt;LI-CODE lang="python"&gt;best_model_pyfunc = mlflow.pyfunc.load_model(
  'runs:/{run_id}/model'.format(
    run_id=best_run.run_id
  )
)&lt;/LI-CODE&gt;&lt;P&gt;When I open the runs in &lt;EM&gt;Experiments&lt;/EM&gt;, the model listed against each run shows "Failed", and I cannot find any further detail about why.&lt;/P&gt;&lt;P&gt;Why is the following code (taken directly from the full, unaltered notebook provided by the tutorial) not logging a model against each run?&lt;/P&gt;&lt;P&gt;And how or where can I find the cause? I tried adding logging inside the objective, but nothing appeared in the cell output. I know the code is running because each run logs test_auc, just not the model.&lt;/P&gt;&lt;LI-CODE lang="python"&gt;def objective(trial):
  # Enable autologging on each worker
  mlflow.sklearn.autolog()
  with mlflow.start_run(nested=True):
    params = {
      'n_estimators': trial.suggest_int('n_estimators', 20, 1000),
      'learning_rate': trial.suggest_float('learning_rate', 0.05, 1.0, log=True),
      'max_depth': trial.suggest_int('max_depth', 2, 5),
    }
    model_hp = sklearn.ensemble.GradientBoostingClassifier(
      random_state=0,
      **params
    )
    model_hp.fit(X_train, y_train)
    predicted_probs = model_hp.predict_proba(X_test)
    # Tune based on the test AUC
    # In production, you could use a separate validation set instead
    roc_auc = sklearn.metrics.roc_auc_score(y_test, predicted_probs[:,1])
    mlflow.log_metric('test_auc', roc_auc)

    # Negate the AUC because Optuna minimizes the objective by default
    return -roc_auc


with mlflow.start_run(run_name='gb_optuna') as run:
  # Use the MLflow Tracking Server as the Optuna storage backend
  experiment_id = mlflow.active_run().info.experiment_id
  mlflow_storage = MlflowStorage(experiment_id=experiment_id)

  # MlflowSparkStudy distributes the tuning using Spark workers
  mlflow_study = MlflowSparkStudy(
    study_name="gb-optuna-tuning",
    storage=mlflow_storage,
  )

  mlflow_study.optimize(objective, n_trials=32, n_jobs=4)&lt;/LI-CODE&gt;</description>
      <pubDate>Thu, 11 Jun 2026 20:19:45 GMT</pubDate>
      <guid>https://community.databricks.com/t5/machine-learning/models-failing-in-tutorial/m-p/158830#M4625</guid>
      <dc:creator>appliable_ai</dc:creator>
      <dc:date>2026-06-11T20:19:45Z</dc:date>
    </item>
  </channel>
</rss>

