Hey @art1, sorry this post got lost in the shuffle. Here are some things to consider regarding your question:
Thanks for flagging this. What you're seeing is expected given how Databricks integrates Hyperopt with MLflow, and there are clear ways to get control back.
What's causing the unexpected logging
- Hyperopt on Databricks uses its own automated MLflow tracking integration, which is independent of the standard mlflow.autolog() feature. That's why runs are still logged even if autologging is disabled at the workspace level or via mlflow.autolog(disable=True) in your notebook.
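For example, in this minimal sketch (objective and search_space stand in for your own function and search space), disabling autologging has no effect on what SparkTrials logs:
import mlflow
from hyperopt import fmin, tpe, SparkTrials

mlflow.autolog(disable=True)  # turns off framework autologgers only

# Hyperopt's Databricks integration is separate: with SparkTrials,
# each trial below is still auto-logged to MLflow despite the line above.
best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
            max_evals=20, trials=SparkTrials(parallelism=2))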
- Hyperopt is deprecated and scheduled for removal in the next major Databricks ML runtime, so steering toward Optuna or Ray Tune is recommended if you want tighter control over logging behavior going forward.
How to stop Hyperopt from auto-logging
Pick one of these approaches:
- Don't use SparkTrials; use the default Trials class. Databricks' automated MLflow logging for Hyperopt is tied to SparkTrials. With the default Trials (whether your training algorithm is single-node or itself distributed), Databricks does not auto-log to MLflow, and you control what gets logged manually.
from hyperopt import fmin, tpe, hp, Trials

def objective(params):
    loss = ...  # your training here; return a scalar loss
    return loss

search_space = {"lr": hp.loguniform("lr", -9, -2)}  # example search space

trials = Trials()  # default Trials -> no Databricks auto-logging
best = fmin(fn=objective, space=search_space, algo=tpe.suggest, max_evals=50, trials=trials)

# If you want logging, do it explicitly:
# import mlflow
# with mlflow.start_run():
#     mlflow.log_params(best)
#     mlflow.log_metric("final_loss", final_loss)
Databricks' documentation says that with distributed training algorithms you should use Trials (not SparkTrials) and call MLflow manually if you want logging.
- Switch to Optuna or Ray Tune. Both integrate cleanly with MLflow and let you opt in to logging via callbacks or explicit API calls, so nothing is logged unless you choose to.
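For instance, here is a minimal Optuna sketch (train_model is a placeholder for your own training routine); nothing reaches MLflow unless you call it yourself:
import mlflow
import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    return train_model(lr)  # placeholder: train and return a scalar loss

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)

# Logging is opt-in and explicit:
with mlflow.start_run():
    mlflow.log_params(study.best_params)
    mlflow.log_metric("best_loss", study.best_value)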
- If you must keep SparkTrials, redirect where it logs. You can set the active experiment (to a workspace experiment you own) or change the tracking URI to a path you control. This doesn't disable logging, but it keeps it confined to the experiment you choose.
import mlflow
mlflow.set_tracking_uri("databricks") # default
mlflow.set_experiment("/Users/you@databricks.com/controlled-experiment") # a workspace experiment you created
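With the experiment set as above, SparkTrials still auto-logs, but the runs land in the experiment you chose (objective and search_space as in the earlier sketch; parallelism=4 is just an illustrative value):
from hyperopt import fmin, tpe, SparkTrials

spark_trials = SparkTrials(parallelism=4)
best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
            max_evals=50, trials=spark_trials)  # auto-logged into the experiment set above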
Why you can't delete the "experiment" in the UI
- Notebook experiments are special. When MLflow runs start without an active experiment, Databricks automatically creates a "notebook experiment" attached to the notebook. These cannot be renamed or deleted from the MLflow UI; the controls appear disabled because they're bound to the notebook's lifecycle.
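You can see the binding yourself: a run started with no active experiment lands in an experiment named after the notebook's path (a minimal check, assuming nothing has called mlflow.set_experiment yet):
import mlflow

with mlflow.start_run() as run:
    pass  # an empty run is enough to trigger experiment creation

exp = mlflow.get_experiment(run.info.experiment_id)
print(exp.name)  # on Databricks: the notebook's workspace path, i.e. a notebook experiment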
- Deleting a notebook experiment via the API moves the notebook to Trash. If you use MlflowClient().delete_experiment(experiment_id) on a notebook experiment, Databricks will move the notebook itself to the Trash folder. That's by design.
- Experiments created from notebooks in Git folders have further limitations. You can't rename, delete, or manage permissions on those experiments directly; you must operate at the Git folder level.
Regain control: practical steps
- Set an explicit workspace experiment at the top of your notebook. That ensures runs never go to the auto-created notebook experiment and gives you full control of its lifecycle in the UI.
import mlflow
mlflow.set_tracking_uri("databricks")
mlflow.set_experiment("/Users/you@databricks.com/my-experiment")
- Avoid SparkTrials to stop Hyperopt auto-logging, and wrap your training explicitly in mlflow.start_run() only where you want logs.
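A sketch of that opt-in pattern (train_model and search_space are placeholders): one parent run for the sweep and one nested run per trial, logged only because you ask for it:
import mlflow
from hyperopt import fmin, tpe, Trials

def objective(params):
    with mlflow.start_run(nested=True):   # logged only because we open a run
        loss = train_model(**params)      # placeholder training routine
        mlflow.log_params(params)
        mlflow.log_metric("loss", loss)
    return loss

with mlflow.start_run(run_name="hyperopt-manual"):
    best = fmin(fn=objective, space=search_space, algo=tpe.suggest,
                max_evals=50, trials=Trials())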
- Consider Optuna or Ray Tune for HPO in 15.4 LTS ML and beyond; logging is opt-in and Hyperopt is being removed in the next major ML runtime.
- If you need to remove the current notebook experiment, either:
- Leave it and start logging to a workspace experiment as above, or
- Use the MLflow API to delete it (knowing the notebook will go to Trash).
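A minimal sketch of the second option (the notebook path shown is hypothetical; remember the notebook itself moves to Trash):
from mlflow.tracking import MlflowClient

client = MlflowClient()
# A notebook experiment's name is the notebook's workspace path
exp = client.get_experiment_by_name("/Users/you@databricks.com/my-notebook")
if exp is not None:
    client.delete_experiment(exp.experiment_id)  # moves the notebook to Trash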
Notes on workspace autologging
- Disabling Databricks Autologging affects framework autologgers (sklearn, PyTorch, XGBoost, etc.), but it does not affect the separate Hyperopt automated tracking integration, which is why it didn't solve the issue.
Hope this helps, Louis.