Run MLflow Projects on Azure Databricks

Anonymous
Not applicable

Hi,

I am trying to follow this simple document to run MLflow Projects within Databricks:

https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow/projects

I have tried to run it in three ways:

  1. From a Databricks notebook within Azure Databricks
  2. With the mlflow CLI (remote)
  3. With databricks-connect

I have verified that all three methods are set up properly, and I get the same error with each of them:

mlflow.exceptions.RestException: BAD_REQUEST: Unable to connect to the linked AzureML workspace. Check that the workspace exists.

The Databricks workspace is linked to an AzureML workspace.

By following this other document:

https://docs.microsoft.com/en-us/azure/machine-learning/how-to-use-mlflow-azure-databricks

I am actually able to create and run MLflow experiments from:

  1. A Databricks notebook
  2. databricks-connect

The issue that I have is that:

  1. The experiments are only created in AzureML
  2. I can only run from within a script/notebook

If I follow this document:

https://docs.microsoft.com/en-us/azure/databricks/applications/mlflow/access-hosted-tracking-server

I am able to use the remote mlflow CLI and can, for example, create an experiment in Databricks only (the experiment does not live in AzureML) with:

mlflow experiments create -n /Users/<your-username>/my-experiment

But again, when trying to do something like this:

mlflow run https://github.com/mlflow/mlflow#examples/sklearn_elasticnet_wine -b databricks --backend-config cluster-spec.json --experiment-id <experiment-id>

I get the error I previously mentioned:

mlflow.exceptions.RestException: BAD_REQUEST: Unable to connect to the linked AzureML workspace. Check that the workspace exists.
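
For readers reproducing the command above: the cluster-spec.json passed to --backend-config is a JSON description of the cluster to launch for the run. A minimal sketch of creating one, assuming a small single-worker cluster; the Spark version and node type are placeholder values and are not taken from this thread:

```shell
# Write a minimal new-cluster spec for `mlflow run ... --backend-config`.
# The Spark version and node type are placeholders (assumptions);
# pick values that are available in your own workspace.
cat > cluster-spec.json <<'EOF'
{
  "spark_version": "7.3.x-scala2.12",
  "num_workers": 1,
  "node_type_id": "Standard_DS3_v2"
}
EOF
```

This only illustrates the shape of the file. The BAD_REQUEST error above is raised regardless of the cluster spec.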

I have set

export MLFLOW_TRACKING_URI=databricks

And everything else as noted in the documentation.
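
A minimal sketch of that kind of setup, assuming authentication through the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables rather than a Databricks CLI profile. The helper name check_mlflow_env and the placeholder values are illustrative only:

```shell
# Hypothetical helper: report which of the variables used to reach a
# Databricks-hosted MLflow tracking server are unset or empty.
# Returns 0 when all are set, 1 otherwise.
check_mlflow_env() {
  missing=0
  for name in MLFLOW_TRACKING_URI DATABRICKS_HOST DATABRICKS_TOKEN; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Placeholder values (assumptions); replace with your workspace URL and token.
export MLFLOW_TRACKING_URI=databricks
export DATABRICKS_HOST="https://<your-workspace>.azuredatabricks.net"
export DATABRICKS_TOKEN="<your-personal-access-token>"

check_mlflow_env && echo "environment looks complete"
```

A check like this only rules out missing client-side configuration. It cannot detect a problem on the workspace side, such as the link to AzureML.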

Is there a configuration I am missing?

Thanks a lot in advance for any help!

Hubert-Dudek
Databricks MVP

Maybe this answer will help:

https://community.databricks.com/s/question/0D53f00001UOu7rCAD/mlflow-resourcealreadyexists

As @Prabakar Ammeappin wrote: "it’s not recommended to “link” the Databricks and AML workspaces, as we are seeing more problems"


My blog: https://databrickster.medium.com/


Anonymous
Not applicable

Hi,

Thanks a lot for the help! The link between the workspaces had already been set up by somebody on the team. I will find out why and try to unlink them. I will report back on whether this does the trick, since the error is not well documented and the answer might help others.

Anonymous
Not applicable

@Arturo Amador​ - That would be great! Once you share your solution, would you be happy to mark your answer as best so others can find it more easily?

Hi @Hubert Dudek, thanks for sharing the post.

You are welcome 🙂



Anonymous
Not applicable

Hi! Thanks a lot for your help. I contacted tech support through my Azure subscription. They confirmed that linking the Databricks and AzureML workspaces does indeed introduce errors, and they provided us with an ARM template for removing the link. After the link was successfully removed, there were no more problems tracking experiments with MLflow directly in the Databricks workspace.