
MLflow Project run always comes back as status failed.

confusedIntern
New Contributor III

Hi! This is kind of an urgent question so any help would be greatly appreciated! Thanks so much!

So I'm following this tutorial to try to create an MLflow project: https://docs.databricks.com/applications/mlflow/projects.html

I tried the example from the tutorial by going to the GitHub repo, downloading the files into my DBFS, and creating a cluster-spec JSON file.

This is the code I used:

[Screenshot: Screen Shot 2022-07-13 at 10.09.07 AM]

But this is the result I get back:

[Screenshot: Screen Shot 2022-07-13 at 10.13.48 AM]

I don't understand why I'm getting back a failed status when I'm using the same code from the tutorial. And when I look at the experiment, it says there are no artifacts:

[Screenshot: Screen Shot 2022-07-13 at 10.17.26 AM]

Please help! Thank you so much!

7 REPLIES

Kaniz
Community Manager

Hi @Margie Kale​, You must use a new cluster specification when running an MLflow Project on Databricks. Running Projects against existing clusters is not supported.

Check your cluster specifications once and try again.
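
For reference, here is a minimal sketch of what a new-cluster specification file and the matching `mlflow run` command could look like. The three fields are standard Databricks cluster settings, but the values (`7.3.x-scala2.12`, `i3.xlarge`, one worker), the file location, and both helper function names are placeholders chosen for illustration, not values taken from this thread. `<uri>` is left unfilled, as in the tutorial.

```python
import json
import shlex
import tempfile
from pathlib import Path


def write_cluster_spec(path, spark_version, node_type_id, num_workers):
    """Write a new-cluster spec as JSON and return the dict that was written."""
    spec = {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "num_workers": num_workers,
    }
    Path(path).write_text(json.dumps(spec, indent=2))
    return spec


def build_run_command(uri, spec_path):
    """Assemble the `mlflow run` command as an argument list."""
    return ["mlflow", "run", uri, "-b", "databricks", "--backend-config", str(spec_path)]


# Placeholder values (assumptions): use a runtime version and node type
# that are valid in your own workspace.
spec_path = Path(tempfile.gettempdir()) / "cluster-spec.json"
spec = write_cluster_spec(spec_path, "7.3.x-scala2.12", "i3.xlarge", 1)

# "<uri>" stands for the project location and must be filled in by you.
command = build_run_command("<uri>", spec_path)
print(shlex.join(command))
```

The printed command is meant to be run from a terminal, presumably on a machine where Databricks credentials are already configured. The spec describes a cluster to be created for the run; it does not point at an existing cluster.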

confusedIntern
New Contributor III

Hi @Kaniz Fatma​! By new cluster specification, do you mean creating a new cluster using a json file? In the tutorial, this code was what we needed to run in the notebook:

mlflow run <uri> -b databricks --backend-config <json-new-cluster-spec>

So for the <json-new-cluster-spec>, what I have is this:

[Screenshot: Screen Shot 2022-07-15 at 8.23.13 AM]

Would this be the new cluster specification?

(Also, the Databricks workspace at my company restricts internet access. I'm not sure whether that factors into this error, but if it does, I would love an explanation!)

Thank you so much!!

confusedIntern
New Contributor III

Hi @Alex Barreto​ and @Kaniz Fatma​ 

I did specify the cluster specifications. It's the .json file from the one line of code. That json file was also uploaded into DBFS for me to use.

I'm still confused as to why this is not working. I have a feeling it might be because my company restricts internet access and the project is trying to reach the internet. If so, how can I run it without it needing internet access?

Thank you so much!

Prabakar
Esteemed Contributor III

Hi @Margie Kale, you have the logs for the failed cluster. Check them to get a better understanding of what went wrong.

Vidula
Honored Contributor

Hey there @Margie Kale​ 

Hope all is well! Just checking in: were you able to resolve your issue? If so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help.

We'd love to hear from you.

Thanks!

sean_owen
Honored Contributor II

This is generally not how you use MLflow in Databricks. You are already in Databricks, so you do not need to send code to Databricks to execute. Instead, just run your code in a notebook; there is no need to package it as an MLflow Project. Projects are primarily for use outside of Databricks. You can send them to Databricks to execute, but you'd do that from a CLI somewhere else.
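
As a rough illustration of the notebook approach described above, the sketch below trains a toy model and logs it with the MLflow tracking API directly, with no Project packaging. The toy model, the `alpha` parameter, and the function names `train_and_evaluate` and `run_in_notebook` are all made up for this example and are not from the tutorial. It assumes `mlflow` is importable in the notebook environment.

```python
import random


def train_and_evaluate(alpha, n=200, seed=0):
    """Toy stand-in for real training: fit y = w * x with shrinkage `alpha`.

    Returns the fitted weight and the mean squared error on the same data.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    w = sxy / (sxx + alpha)  # closed-form ridge estimate for a single feature
    mse = sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / n
    return w, mse


def run_in_notebook(alpha):
    """Log one training run to MLflow from inside a notebook cell."""
    import mlflow  # imported here so the toy model above works without it

    with mlflow.start_run():
        mlflow.log_param("alpha", alpha)
        w, mse = train_and_evaluate(alpha)
        mlflow.log_metric("weight", w)
        mlflow.log_metric("mse", mse)
    return mse
```

In a notebook cell you would call something like `run_in_notebook(0.5)`, and the run should then appear under the notebook's experiment. Nothing is sent to a separate cluster, so no cluster-spec JSON is involved.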
