Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.

How to load a synapse/maven package in Dbricks Model Serving Endpoint

gsalazar
New Contributor

Hi!

Quite similar to this 2021 post: https://community.databricks.com/t5/data-engineering/how-to-include-a-third-party-maven-package-in-m...

I'm attempting to serve a SynapseML model (with Maven dependencies) using Databricks Model Serving Endpoints. On general-purpose clusters I can add the libraries/packages without issue, but when trying to deploy the serving endpoint I am unable to do so. Is it related to the conda.yaml file? Do you know of any workarounds?

Error Message: DEPLOYMENT_FAILED state: Failed to load the model. Exit code 1.

Best,

Gabriel

 

1 REPLY

mark_ott
Databricks Employee

You are encountering issues serving a SynapseML model (with Maven dependencies) via Databricks Model Serving Endpoints: the model works fine on general-purpose clusters but fails to deploy as a serving endpoint. This is a known limitation of Databricks Model Serving, where the endpoint environments do not allow installing Maven packages or Spark JARs the way interactive clusters do, leading to errors like "DEPLOYMENT_FAILED state: Failed to load the model. Exit code 1".

Why the Error Occurs

  • Databricks Model Serving endpoints currently do not support dynamic installation of Maven packages or Spark libraries. The environment for endpoints is limited to what is specified in the conda.yaml or requirements.txt files (Python dependencies), and cannot fetch JARs/Maven artifacts at runtime.

  • Your interactive cluster can resolve and load Maven dependencies, but endpoints have a statically built environment, so Java/Scala dependencies aren't picked up unless they are present in the image.
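
One way to see exactly what the endpoint build will install is to ask MLflow for the Python dependencies it captured when the model was logged. The sketch below uses a placeholder model URI; whatever the returned file contains is the complete dependency set the endpoint works from, and Maven coordinates never appear in it.

```python
import mlflow

# Placeholder URI -- substitute your registered model or run URI.
model_uri = "models:/my_synapseml_model/1"

# Returns a local path to the requirements file MLflow captured at logging
# time. The serving endpoint builds its container from these pinned Python
# packages only; JAR/Maven dependencies are not represented here at all.
req_path = mlflow.pyfunc.get_model_dependencies(model_uri)

with open(req_path) as f:
    print(f.read())
```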

Is it related to the conda.yaml file?

  • Yes: While conda.yaml controls Python dependencies (and is critical for packages like pandas/numpy), it does not pull Maven or Java dependencies for model serving endpoints.

  • So, adding Maven coordinates or JARs to conda.yaml won't resolve your issue—the endpoint environment simply won't fetch them.
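
To make that concrete, here is a minimal sketch (illustrative model class and package pins, not your exact setup) of what you can actually declare when logging a model: a Python-only dependency list that ends up in the model's conda.yaml/requirements.txt. There is no field here for Maven coordinates, and this is also the place to pin versions per recommendation 6 below.

```python
import mlflow
import mlflow.pyfunc


class EchoModel(mlflow.pyfunc.PythonModel):
    # Illustrative pure-Python model: no Spark/JVM involved at predict time.
    def predict(self, context, model_input):
        return model_input


with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=EchoModel(),
        # Only Python packages can be declared; they are written into the
        # model's conda.yaml / requirements.txt. There is no equivalent
        # mechanism for JARs or Maven coordinates.
        pip_requirements=[
            "pandas==1.5.3",
            "numpy==1.23.5",
        ],
    )
```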

Community Experiences

  • Users have reported that models depending on Spark-based libraries (like SynapseML/LightGBM) register and work in MLflow and on general clusters, but fail to serve properly due to missing Java dependencies on endpoints.

  • Common errors include java.lang.ClassNotFoundException for core SynapseML classes.

Workarounds and Recommendations

  1. Log Only Pure Python Models for Endpoints: If you need to serve via an endpoint, finalize your model into a format that is fully Python-native and does not require a Java/Spark runtime or external Maven dependencies (see the sketch after this list).

  2. Custom Container/Image Approach: For some organizations, Databricks supports custom images for serving. You may be able to bake the needed JARs into your serving environment if your Databricks workspace and subscription allow it (contact support for details).

  3. Manual Dependency Packaging: As a temporary hack, some users extract required JARs into the model artifact path and attempt to programmatically add them in their MLflow model code, but this is often brittle and not officially supported.

  4. Alternative Endpoints (General-Purpose Clusters): For Spark/Java dependencies, it is recommended to deploy inference pipelines on normal Databricks clusters (not endpoints) and expose them via REST (Flask, FastAPI, etc.) on cluster jobs if real-time serving is needed.

  5. MLflow python_function Models with Explicit Dependency Management: If deploying via MLflow, log your model with the python_function (pyfunc) flavor and specify dependencies only for Python packages in your conda.yaml and requirements.txt. Java dependencies (Maven/JARs) should already be present on your cluster/environment ahead of serving.

  6. Check for Environment Binary Compatibility: If you still get errors related to binaries (numpy, pandas, etc.), make sure versions are correctly pinned in your conda.yaml, as version mismatches can also surface as deployment failures.
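
As an illustration of workaround 1, a common pattern for SynapseML LightGBM models is to export the trained model to LightGBM's native format on your general-purpose cluster and wrap it in a pure-Python pyfunc, so the endpoint only needs the lightgbm Python package. Treat the sketch below as a starting point: model names and paths are placeholders, and the exact saveNativeModel output layout depends on your SynapseML version.

```python
import lightgbm as lgb
import mlflow
import mlflow.pyfunc

# Step 1 (on the general-purpose cluster where SynapseML is installed):
# export the Spark-trained model to LightGBM's native format. The variable
# and path below are placeholders.
# spark_lgbm_model.saveNativeModel("/dbfs/tmp/lgbm_native")


class NativeLGBMWrapper(mlflow.pyfunc.PythonModel):
    """Pure-Python inference: no Spark/JVM required at serving time."""

    def load_context(self, context):
        # The exported native model file is bundled as an MLflow artifact.
        self.booster = lgb.Booster(model_file=context.artifacts["lgbm_model"])

    def predict(self, context, model_input):
        # model_input: pandas DataFrame with the training feature columns.
        return self.booster.predict(model_input)


# Step 2: log a self-contained pyfunc model that only needs Python packages.
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=NativeLGBMWrapper(),
        # Path to the exported native model text file (assumed location).
        artifacts={"lgbm_model": "/dbfs/tmp/lgbm_native/model.txt"},
        pip_requirements=["lightgbm", "pandas"],
    )
```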

Action Steps

  • Refactor your model so that inference can run entirely in Python with no Java/Spark runtime required, if endpoint serving is mandatory.

  • Otherwise, continue serving Spark/SynapseML models from general-purpose clusters where you can control Maven and JAR installation (a minimal REST sketch follows below).
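
If you take the general-purpose-cluster route and still need an HTTP interface (workaround 4), a bare-bones sketch could look like the following. The model URI, route, and port are assumptions; in practice you would add authentication and run this as a long-lived job on a cluster that has the SynapseML Maven package attached.

```python
import mlflow
import pandas as pd
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder URI: load the model once at startup, on a cluster that already
# has the SynapseML Maven package installed (pyfunc wraps the Spark flavor).
model = mlflow.pyfunc.load_model("models:/my_synapseml_model/1")


@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON list of records, e.g. [{"feature_a": 1.0, ...}, ...].
    records = request.get_json()
    df = pd.DataFrame(records)
    preds = model.predict(df)
    # Normalize the result (array, Series, or DataFrame) for JSON output.
    if isinstance(preds, pd.DataFrame):
        payload = preds.to_dict(orient="records")
    else:
        payload = pd.Series(preds).tolist()
    return jsonify({"predictions": payload})


if __name__ == "__main__":
    # Bind on the driver; expose via your preferred networking setup.
    app.run(host="0.0.0.0", port=8080)
```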

For up-to-date workarounds and official announcements, review the Databricks Machine Learning documentation and reach out to Databricks support or the community with feature requests regarding Model Serving and Maven dependencies.