How to include a third-party Maven package in an MLflow Model Serving job cluster in Azure Databricks
07-29-2021 02:11 PM
We are trying to use MLflow Model Serving, which exposes real-time model serving behind a REST API; it launches a single-node cluster to host our model.
The issue happens when that single-node cluster tries to prepare its environment from the conda.yaml file that was created when the model was logged with MLflow. It looks like I can only specify pip packages there, not Maven packages.
from mlflow.utils.environment import _mlflow_conda_env
import cloudpickle
import sklearn
import pyspark

# Conda environment that gets written to conda.yaml when the model is logged
conda_env = _mlflow_conda_env(
    additional_conda_deps=None,
    additional_pip_deps=[
        "cloudpickle=={}".format(cloudpickle.__version__),
        "scikit-learn=={}".format(sklearn.__version__),
        "pyspark=={}".format(pyspark.__version__),
    ],
    additional_conda_channels=None,
)

How can I tell the cluster to install a Maven jar file?
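For context, the conda environment above gets attached when the model is logged, roughly as in the sketch below (sk_model stands in for our fitted estimator and "model" is just the artifact path; only conda/pip dependencies end up in the resulting conda.yaml, so there is no place to declare Maven coordinates):

import mlflow.sklearn

# Log the model together with the custom conda environment defined above.
# MLflow Model Serving later rebuilds this environment on the serving cluster.
mlflow.sklearn.log_model(
    sk_model=sk_model,
    artifact_path="model",
    conda_env=conda_env,
)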
- Labels: MlFlow
09-01-2021 10:45 AM
I don't believe you can do that at the moment. Is it required for a Python model? Only Python-based models can really be served this way right now.
09-14-2021 06:55 AM
Unfortunately we came across this same issue. We were trying to use MLflow Serving to produce an API that could take text input and pass it through some NLP. In this instance we had installed a Maven package on the cluster, so the experiment would run fine in a notebook, but MLflow Serving would fail because it couldn't install the Maven package. As an alternative, it would help to be able to modify the job cluster that is provisioned, so we could add the additional libraries/packages that are required but cannot be specified in the conda definition; a rough sketch of what we have in mind follows below.
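The kind of thing we would want to do, sketched here against the standard Databricks Libraries API. This is untested for Model Serving: the workspace URL, token, cluster id, and Maven coordinates are placeholders, and it is an open question whether the auto-provisioned serving cluster accepts library installs this way at all.

import requests

# Placeholders: your workspace URL, a personal access token, and the cluster id
# of the serving cluster (assuming it can be looked up in the Clusters UI/API).
DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "dapiXXXXXXXX"
CLUSTER_ID = "0901-123456-abcd123"

# Ask Databricks to install a Maven library on that cluster.
# The coordinates below are just an example NLP package.
resp = requests.post(
    f"{DATABRICKS_HOST}/api/2.0/libraries/install",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "cluster_id": CLUSTER_ID,
        "libraries": [
            {"maven": {"coordinates": "com.johnsnowlabs.nlp:spark-nlp_2.12:3.4.0"}}
        ],
    },
)
resp.raise_for_status()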

