Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

How to properly use Databricks Managed MLflow tracking/registry with Databricks Workflows

GGG_P
New Contributor III

Hey.

I'm building a DevOps/MLOps pipeline to train and register a simple scikit-learn model.

I created a simple Databricks Workflow that runs the training and registration task on a specific Git branch. (The Workflow is set up with a Databricks Repo on that branch, with a notebook as its input.)

FYI: everything works fine when I run the notebook standalone in my Workspace.

During the Databricks Workflow execution, I realized that I need to define my own 'experiment_name' (see the error below):

2022/12/08 04:36:32 WARNING mlflow.tracking.default_experiment.registry: Encountered unexpected error while getting experiment_id: FEATURE_DISABLED: Creation of experiments in jobs is not enabled. If using the Python fluent API, you can set an active experiment under which to create runs by calling mlflow.set_experiment("experiment_name") at the start of your program.
2022/12/08 04:36:32 WARNING mlflow.tracking.default_experiment.registry: Encountered unexpected error while getting experiment_id: None has type NoneType, but expected one of: bytes, unicode

So I called set_tracking_uri with a specific custom folder.

I also created my experiment.

mlflow.set_tracking_uri("/my_custom_folder/")
experiment_id = mlflow.create_experiment("my_exp_from_databricks")  # create_experiment returns an experiment ID, not a run ID

MLflow is able to log everything, BUT the results are no longer managed by Databricks MLflow.

I can't see anything in the Databricks MLflow UI.

I guess that my tracking_uri is wrong, but I have no idea what to set it to so that the runs show up in the Databricks MLflow UI.

My question is simple: is it possible to run, log, and register a model using Databricks Managed MLflow from a Databricks Workflow?

Thank you.

3 REPLIES

GGG_P
New Contributor III

It works just by setting the experiment to a specific workspace path:

mlflow.set_experiment(f"/Users/{username}/my_exp")
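For anyone landing here later, this is a minimal sketch of the same idea wrapped in small helpers. The function names and the example username are hypothetical. It assumes the code runs in a Databricks job where the mlflow package is installed; the "databricks" tracking URI is normally the default there, so setting it explicitly is only a safeguard and may be unnecessary.

```python
def user_experiment_path(username: str, name: str) -> str:
    """Build an absolute workspace path for an experiment in a user's folder.

    Hypothetical helper: the "/Users/<username>/<name>" layout mirrors the
    path used in this thread.
    """
    if not username or not name:
        raise ValueError("username and name must be non-empty")
    return f"/Users/{username}/{name}"


def activate_experiment(username: str, name: str) -> None:
    """Point MLflow at the Databricks-managed tracking server and select the experiment."""
    # Imported lazily so the path helper stays usable where mlflow is not installed.
    import mlflow

    # Assumption: inside a Databricks job this is usually already the default.
    mlflow.set_tracking_uri("databricks")
    # set_experiment creates the experiment if it does not exist yet,
    # otherwise it reuses the existing one with that name.
    mlflow.set_experiment(user_experiment_path(username, name))
```

In the job notebook this would be called once, before any mlflow.start_run(), for example activate_experiment("someone@example.com", "my_exp").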

BernardoC
New Contributor II

Nice contribution!

kdatt
New Contributor II

I had the same issue when calling a notebook from a workflow. I was able to do what you did, but in my case it needed a new experiment name for each run, so I had to do this:

import time
import mlflow

# Set the experiment (env and experiment are defined earlier in the notebook)
experiment_name = f"/Workspace/MLOps/{env}/experiment/{experiment}_{time.strftime('%Y-%m-%d_%H-%M-%S')}"
mlflow.set_experiment(experiment_name)
 
But this assigns a new experiment ID on each run, which doesn't work for me because I was hardcoding that ID for inference.
Not sure what the best option is here.
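One possible option, sketched under assumptions rather than tested against this exact setup: keep the experiment name stable, move the timestamp into the run name, and look the experiment ID up by name at inference time instead of hardcoding it. The helper names below are hypothetical, and the path layout is simply copied from the snippet above. mlflow.get_experiment_by_name and the run_name argument of mlflow.start_run are part of the MLflow Python API.

```python
import time


def experiment_path(env: str, experiment: str) -> str:
    """Stable experiment path: no timestamp, so the experiment ID stays the same."""
    return f"/Workspace/MLOps/{env}/experiment/{experiment}"


def timestamped_run_name(experiment: str, now=None) -> str:
    """Put the timestamp on the run instead of on the experiment."""
    moment = now if now is not None else time.localtime()
    return f"{experiment}_{time.strftime('%Y-%m-%d_%H-%M-%S', moment)}"


def resolve_experiment_id(env: str, experiment: str) -> str:
    """Look up the experiment ID by name at inference time."""
    # Imported lazily so the pure helpers above work without mlflow installed.
    import mlflow

    found = mlflow.get_experiment_by_name(experiment_path(env, experiment))
    if found is None:
        raise LookupError(f"No experiment at {experiment_path(env, experiment)}")
    return found.experiment_id
```

Training would then call mlflow.set_experiment(experiment_path(env, experiment)) followed by mlflow.start_run(run_name=timestamped_run_name(experiment)), and the inference code would call resolve_experiment_id(env, experiment) rather than relying on a fixed ID.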
