You are encountering an error when trying to deploy a model via the Databricks Model Serving UI, even though the same model works correctly for predictions in a notebook. The error message includes:
[JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number
JAVA_HOME is not set
This issue occurs because your model was logged with mlflow.spark.log_model(), which saves it in MLflow's Spark MLlib flavor. Loading a model in that flavor starts a Spark session, and Spark depends on the Java Virtual Machine (JVM). The JAVA_GATEWAY_EXITED error is what PySpark raises when it cannot launch that JVM. This works fine in Databricks notebooks, where Spark and Java are available by default, but Databricks Model Serving runs models in isolated Python-only containers that do not include Java or Spark.
The failure is not caused by the worker type (such as "sync" or "gevent"), but by the fact that the serving environment lacks the necessary Java runtime and Spark context to load and serve a Spark model.
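One way to see the difference between the two environments is to check whether a Java runtime is visible to the Python process. The sketch below uses only the standard library. The helper name java_runtime_status is hypothetical, and it only reports what the current process can see; it makes no claim about how the serving containers are built.

```python
import os
import shutil


def java_runtime_status(env=None):
    """Report whether a Java runtime is discoverable from this process."""
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    # Spark's launcher looks for Java via JAVA_HOME first, then for a
    # `java` executable on PATH. If neither is found it reports
    # "JAVA_HOME is not set", which matches the error above.
    java_on_path = shutil.which("java", path=env.get("PATH", ""))
    return {
        "java_home": java_home,
        "java_on_path": java_on_path,
        "jvm_available": bool(java_home) or java_on_path is not None,
    }


if __name__ == "__main__":
    print(java_runtime_status())
```

Running this in a notebook attached to a cluster should report a JVM, while an environment without Java reports jvm_available as False.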
To resolve this issue, you have a few options:
Avoid logging Spark ML models if you intend to serve them through Databricks Model Serving. Instead, train your model with a Python-based library such as scikit-learn, XGBoost, or LightGBM, or wrap your logic in a custom MLflow pyfunc model. For example, you can convert your Spark DataFrame to a pandas DataFrame using .toPandas(), then train a scikit-learn model and log it using mlflow.sklearn.log_model() or mlflow.pyfunc.log_model(). Keep in mind that .toPandas() collects the entire dataset onto the driver, so this only works when the training data fits in driver memory.
If you must use Spark ML, you will need to perform inference in a notebook or via a Databricks job. These environments support Java and Spark and can load the model correctly. However, you will not be able to deploy it as a real-time REST endpoint using the Model Serving UI.
In a non-Databricks environment, you could manually deploy a model server that includes Java and Spark and configure the JAVA_HOME environment variable correctly. However, this approach is not supported within Databricks Model Serving.
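The first option above can be sketched roughly as follows. This is a minimal illustration, not a drop-in replacement for your pipeline: the column names, the choice of LogisticRegression, and the helper names train_sklearn_model and log_for_serving are all assumptions made for the example. In a notebook, the pandas frame would come from your Spark DataFrame via .toPandas(); a small hand-made frame stands in for it here.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression


def train_sklearn_model(pdf, feature_cols, label_col):
    """Fit a scikit-learn classifier on a pandas DataFrame."""
    model = LogisticRegression(max_iter=1000)
    model.fit(pdf[feature_cols], pdf[label_col])
    return model


def log_for_serving(model, input_example):
    """Log the model in MLflow's sklearn flavor, which loads without a JVM."""
    # Imported lazily so the training step can be tried without MLflow installed.
    import mlflow
    import mlflow.sklearn

    with mlflow.start_run():
        mlflow.sklearn.log_model(model, "model", input_example=input_example)


if __name__ == "__main__":
    # Stand-in for: pdf = spark_df.toPandas()
    pdf = pd.DataFrame(
        {
            "x1": [0.0, 1.0, 2.0, 3.0],
            "x2": [1.0, 0.0, 1.0, 0.0],
            "label": [0, 0, 1, 1],
        }
    )
    features = ["x1", "x2"]
    model = train_sklearn_model(pdf, features, "label")
    print(model.predict(pdf[features]))
    # Uncomment where an MLflow tracking setup is available (e.g. a notebook):
    # log_for_serving(model, pdf[features].head(2))
```

A model logged this way is stored in a Python-native flavor, so loading it does not start Spark or require JAVA_HOME.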
In summary, Databricks Model Serving does not support serving models that require Spark or Java. You will need to use a Python-native model format to deploy your model as a REST API endpoint.
Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa