Machine Learning

Run mlflow project from a Job.

Orianh
Valued Contributor II

Hey Guys,

I'm trying to build an automated process that runs ML training sessions using MLflow and Databricks jobs.

I develop the model on my local machine in an IDE. When I'm finished, I have a template notebook that takes the MLflow project path and the params as its parameters.

While trying to set up a job that runs this MLflow project, I ran into some issues and hope you will be able to help me.

Inside the training code (e.g. the main entry point), I'm using set_experiment and start_run with specific names for the experiment and the run.

When I run this code as an MLflow project using the run API call without specifying experiment_name / run_name in the call, I get an error saying that I can't create an experiment from a job.

On the other hand, when experiment_name and run_name are specified in the run API call, MLflow ignores the set_experiment call and the start_run call with the run name I wanted. Do you know if there is an option to enable creating experiments from a job, or a way to avoid having to specify the experiment name and run name in the run call?

After some tries I saw that MLflow creates an experiment before the training code actually runs. This is a little problematic, because if I need to specify the run name and the experiment name manually, this process is not going to be very automated 😅
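The behaviour described above is consistent with how project runs appear to work: as far as I can tell, mlflow.run creates the run before launching the entry point and hands its ID to the training process through the MLFLOW_RUN_ID environment variable, so start_run resumes that run instead of starting a new one. Below is a minimal sketch of an entry point that adapts to both cases. The helper names plan_run and open_run are hypothetical, and renaming the run through the mlflow.runName tag is an assumption about how the tracking server displays run names, not something confirmed in this thread.

```python
import os


def plan_run(experiment_name, run_name, environ=None):
    """Decide how the entry point should open its run (hypothetical helper)."""
    environ = os.environ if environ is None else environ
    run_id = environ.get("MLFLOW_RUN_ID")
    if run_id:
        # Launched through mlflow.run: a run already exists, so resume it
        # and only record the desired name as a tag.
        return {
            "resume": True,
            "run_id": run_id,
            "tags": {"mlflow.runName": run_name},
        }
    # Launched directly (e.g. from an IDE): choose experiment and run name here.
    return {
        "resume": False,
        "experiment_name": experiment_name,
        "run_name": run_name,
    }


def open_run(experiment_name, run_name):
    """Open an MLflow run according to plan_run (hypothetical helper)."""
    import mlflow  # imported here so plan_run can be used without mlflow

    plan = plan_run(experiment_name, run_name)
    if plan["resume"]:
        run = mlflow.start_run(run_id=plan["run_id"])
        mlflow.set_tags(plan["tags"])  # assumption: the UI picks the name up from this tag
        return run
    mlflow.set_experiment(plan["experiment_name"])
    return mlflow.start_run(run_name=plan["run_name"])
```

In the training code this would replace the direct set_experiment / start_run calls, for example `with open_run("my_experiment", "my_run"): ...`, where both names are placeholders.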

Code example:

import mlflow
 
# This line throws an error; a screenshot is attached.
mlflow.run(dbutils.widgets.get('Project path'), parameters=params)
 
# This line ignores any set_experiment / start_run(run_name='something') specified in the code.
mlflow.run(dbutils.widgets.get('Project path'), parameters=params, experiment_name=dbutils.widgets.get('experiment_name'), run_name='test')
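
One way to avoid typing the names by hand is to derive them from the project path before calling mlflow.run. The sketch below is only an illustration: build_run_kwargs is a hypothetical helper, the base folder /Shared/experiments is an assumed location, and whether the job is allowed to create an experiment there depends on the workspace setup.

```python
import os
import time


def build_run_kwargs(project_uri, params, base_dir="/Shared/experiments", now=None):
    """Build keyword arguments for mlflow.run from the project path (hypothetical helper)."""
    # Use the last path component of the project as a stable experiment label.
    project = os.path.basename(project_uri.rstrip("/")) or "project"
    # Put the timestamp in the run name so the experiment name stays constant.
    moment = time.localtime() if now is None else now
    stamp = time.strftime("%Y-%m-%d_%H-%M-%S", moment)
    return {
        "uri": project_uri,
        "parameters": params,
        "experiment_name": f"{base_dir}/{project}",
        "run_name": f"{project}_{stamp}",
    }
```

The notebook call would then become `mlflow.run(**build_run_kwargs(dbutils.widgets.get('Project path'), params))`, so the job only needs the project path and the params.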
 

[Attached screenshot: error]

1 REPLY

kdatt
New Contributor II

Hi,

Were you able to figure this one out? I have the same issue when trying to call a training notebook from a workflow. Each run needs a new experiment name, which I can provide, but then a new experiment ID/name is created for each workflow run, whereas when you run the notebook directly, the experiment ID stays the same.

import time
import mlflow

# Set the experiment (env and experiment are defined elsewhere in the notebook)
experiment_name = f"/Workspace/MLOps/{env}/experiment/{experiment}_{time.strftime('%Y-%m-%d_%H-%M-%S')}"
mlflow.set_experiment(experiment_name)
 
I was relying on the ID staying the same, because I use that ID for inference.
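
If the goal is a stable experiment ID, one option is to drop the timestamp from the experiment name and look the experiment up by name, creating it only when it is missing. A minimal sketch, assuming client is an mlflow.tracking.MlflowClient instance; the helper names are hypothetical, and whether a workflow run is permitted to create the experiment is not confirmed in this thread.

```python
def stable_experiment_name(env, experiment):
    """Same path as in the snippet above, without the timestamp suffix."""
    return f"/Workspace/MLOps/{env}/experiment/{experiment}"


def get_or_create_experiment_id(client, name):
    """Return the ID of the experiment called `name`, creating it if it is missing.

    `client` is expected to behave like mlflow.tracking.MlflowClient.
    """
    existing = client.get_experiment_by_name(name)
    if existing is not None:
        # Reuse the experiment, so the ID is the same on every workflow run.
        return existing.experiment_id
    return client.create_experiment(name)
```

With this approach, the per-run timestamp can move into the run name, and the returned ID can be reused at inference time.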
