
Model Serving: An error occurred while loading the model: [JAVA_GATEWAY_EXITED] Java gateway process

mauricen
New Contributor

Hi! I have a custom model registered in Unity Catalog that I am able to load and use for prediction. However, I am unable to deploy the same model using the Model Serving UI. The Databricks Runtime used for model training and deployment is 15.4 ML.

Thanks in advance.

Code Snippet

import mlflow
from mlflow.models import infer_signature
from pyspark.ml.classification import LogisticRegression

# Note: `pipeline` (a pyspark.ml Pipeline) and `filtered_data` (a Spark
# DataFrame) are assumed to be defined in earlier cells.

# Define conda environment or pip requirements
conda_env = {
    'name': 'mlflow-env',
    'channels': ['defaults'],
    'dependencies': [
        'python=3.11.11',
        'pip',
        {
            'pip': [
                'pyspark==3.5.0',
                'mlflow==2.19.0'
            ]
        }
    ]
}

# set model alias
model_alias = 'macro_vars'

# Log model to MLflow
with mlflow.start_run(run_name=f"{model_alias}_run") as run:
    
    # Fit pipeline to training data
    pipeline_model = pipeline.fit(filtered_data)
    
    # Transform data using pipeline
    transformed_data = pipeline_model.transform(filtered_data)
    
    # Train logistic regression model
    lr_model = LogisticRegression(featuresCol='features', labelCol='CO_flag', maxIter=100)
    lr_model_fit = lr_model.fit(transformed_data)
    
    # Make predictions using trained model
    predictions = lr_model_fit.transform(transformed_data)
    
    # Log Model
    signature = infer_signature(transformed_data.select('features'), predictions.select('prediction'))
    
    mlflow.spark.log_model(
        spark_model=lr_model_fit,
        artifact_path=model_alias,
        signature=signature,
        conda_env=conda_env
    )
    
    # Register model
    catalog_name = "czcl"
    schema_name = "czcl_gold"
    model_name = "czcl_model"
    registered_name = f"{catalog_name}.{schema_name}.{model_name}"
    model_uri = f"runs:/{run.info.run_id}/{model_alias}"

    result = mlflow.register_model(model_uri, registered_name)
    client = mlflow.MlflowClient()
    client.set_registered_model_alias(name=registered_name, alias=model_alias, version=result.version)

Error Message:

[wgkkr] 2025-05-22 23:10:44.182 INFO : Initializing .........
[wgkkr] WARNING:root:mlflow-server
[wgkkr] [2025-05-22 23:10:44 +0000] [10] [INFO] Starting gunicorn 23.0.0
[wgkkr] [2025-05-22 23:10:44 +0000] [10] [INFO] Listening at: http://0.0.0.0:8080 (10)
[wgkkr] [2025-05-22 23:10:44 +0000] [10] [INFO] Using worker: sync
[wgkkr] [2025-05-22 23:10:44 +0000] [11] [INFO] Booting worker with pid: 11
[wgkkr] JAVA_HOME is not set
[wgkkr] [2025-05-22 23:10:50 +0000] An error occurred while loading the model: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.
[wgkkr] [2025-05-22 23:10:50 +0000] Traceback (most recent call last):
[wgkkr] [2025-05-22 23:10:50 +0000]   File "/opt/conda/envs/mlflow-env/lib/python3.11/site-packages/mlflowserving/scoring_server/__init__.py", line 212, in get_model_option_or_exit
[wgkkr] [2025-05-22 23:10:50 +0000]     self.model = self.model_future.result()
[wgkkr] [2025-05-22 23:10:50 +0000]                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
[wgkkr] [2025-05-22 23:10:50 +0000]   File "/opt/conda/envs/mlflow-env/lib/python3.11/concurrent/futures/_base.py", line 449, in result
[wgkkr] [2025-05-22 23:10:50 +0000]     return self.__get_result()
[wgkkr] [2025-05-22 23:10:50 +0000]            ^^^^^^^^^^^^^^^^^^^
[wgkkr] [2025-05-22 23:10:50 +0000]   File "/opt/conda/envs/mlflow-env/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
[wgkkr] [2025-05-22 23:10:50 +0000]     raise self._exception
[wgkkr] [2025-05-22 23:10:50 +0000]   File "/opt/conda/envs/mlflow-env/lib/python3.11/concurrent/futures/thread.py", line 58, in run
[wgkkr] [2025-05-22 23:10:50 +0000]     result = self.fn(*self.args, **self.kwargs)
[wgkkr] [2025-05-22 23:10:50 +0000]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[wgkkr] [2025-05-22 23:10:50 +0000]   File "/opt/conda/envs/mlflow-env/lib/python3.11/site-packages/mlflowserving/scoring_server/__init__.py", line 132, in _load_model_closure

 

2 REPLIES

Vidhi_Khaitan
Databricks Employee

Hi Mauricen,

[JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.
...
[wgkkr] JAVA_HOME is not set


This suggests that the serving environment cannot spin up a JVM, which is required to run Spark models (such as pyspark.ml.classification.LogisticRegressionModel):
Spark models need a JVM to work (they rely on the Spark engine).
Databricks Model Serving (serverless or classic) does not support Spark MLlib models.

You can switch to scikit-learn as the model type, or use the model only in batch inference pipelines, not via the Model Serving UI. A sketch of the batch route is below.
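For the batch route, here is a minimal sketch, assuming the catalog, schema, model name, and alias from the snippet above, and a Spark DataFrame `scoring_data` with a 'features' column built by the same feature pipeline used at training time:

import mlflow

mlflow.set_registry_uri("databricks-uc")  # Unity Catalog model registry

# Load the Spark model by its UC name and alias (names from the snippet above)
model_uri = "models:/czcl.czcl_gold.czcl_model@macro_vars"
loaded_model = mlflow.spark.load_model(model_uri)

# Score in a notebook or job, where a JVM and Spark are available
predictions = loaded_model.transform(scoring_data)
predictions.select("prediction").show()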

WiliamRosa
New Contributor II

You are encountering an error when trying to deploy a model via the Databricks Model Serving UI, even though the same model works correctly for predictions in a notebook. The error message includes:

[JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number
JAVA_HOME is not set

This issue occurs because your model was logged using mlflow.spark.log_model(), which creates a Spark MLlib model. These models depend on the Java Virtual Machine (JVM) to function. While this works fine in Databricks notebooks where Spark and Java are available by default, Databricks Model Serving runs models in isolated Python-only containers that do not include Java or Spark.

The failure is not caused by the worker type (such as "sync" or "gevent"), but by the fact that the serving environment lacks the necessary Java runtime and Spark context to load and serve a Spark model.

To resolve this issue, you have a few options:

Avoid logging Spark ML models if you intend to serve them through Databricks Model Serving. Instead, train your model using a Python-based library such as scikit-learn, XGBoost, LightGBM, or a custom MLflow pyfunc model. For example, you can convert your Spark DataFrame to a pandas DataFrame using .toPandas(), then train a scikit-learn model and log it using mlflow.sklearn.log_model() or mlflow.pyfunc.log_model().
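A minimal sketch of that conversion, assuming the `transformed_data` DataFrame and column names from the snippet above (the 'features' column holds Spark ML vectors, so it is expanded into plain float columns first, and the data is assumed to fit in memory):

import mlflow
import pandas as pd
from mlflow.models import infer_signature
from sklearn.linear_model import LogisticRegression

# Collect the Spark features/label columns into pandas
pdf = transformed_data.select("features", "CO_flag").toPandas()
X = pd.DataFrame(pdf["features"].apply(lambda v: v.toArray()).tolist())
y = pdf["CO_flag"]

# Train a plain scikit-learn model, which Model Serving can host without a JVM
sk_model = LogisticRegression(max_iter=100).fit(X, y)

# Log with the sklearn flavor; 'macro_vars_sklearn' is a hypothetical artifact path
signature = infer_signature(X, sk_model.predict(X))
with mlflow.start_run():
    mlflow.sklearn.log_model(sk_model, artifact_path="macro_vars_sklearn", signature=signature)

The resulting model can then be registered to Unity Catalog and served through the Model Serving UI as usual.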

If you must use Spark ML, you will need to perform inference in a notebook or via a Databricks job. These environments support Java and Spark and can load the model correctly. However, you will not be able to deploy it as a real-time REST endpoint using the Model Serving UI.

In a non-Databricks environment, you could manually deploy a model server that includes Java and Spark and configure the JAVA_HOME environment variable correctly. However, this approach is not supported within Databricks Model Serving.

In summary, Databricks Model Serving does not support serving models that require Spark or Java. You will need to use a Python-native model format to deploy your model as a REST API endpoint.

Wiliam Rosa
Data Engineer | Machine Learning Engineer
LinkedIn: linkedin.com/in/wiliamrosa
