Machine Learning
Dive into the world of machine learning on the Databricks platform. Explore discussions on algorithms, model training, deployment, and more. Connect with ML enthusiasts and experts.
Feature Store Model Serving endpoint

NaeemS
New Contributor III

Hi,

I am trying to deploy a model that was logged with the Feature Engineering client as a serving endpoint in Databricks, but I am facing the following error:

The Databricks Lookup client from databricks-feature-lookup and the Databricks Feature Store client from databricks-feature-engineering cannot be installed in the same Python environment.

When I log my model using FeatureEngineeringClient, databricks-feature-lookup is added to my requirements file by default, and when I load that model in another environment I get this error. When I add databricks-feature-engineering as a dependency, I get the same error during creation of my model serving endpoint. And if I don't add databricks-feature-engineering as a dependency, I get other errors while creating the serving endpoint.

 

model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path)
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflow/spark/__init__.py", line 898, in _load_pyfunc
[cc8d72wmj6] spark = _create_local_spark_session_for_loading_spark_model()
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflow/utils/_spark_utils.py", line 131, in _create_local_spark_session_for_loading_spark_model
[cc8d72wmj6] .getOrCreate()
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/sql/session.py", line 497, in getOrCreate
[cc8d72wmj6] sc = SparkContext.getOrCreate(sparkConf)
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/context.py", line 515, in getOrCreate
[cc8d72wmj6] SparkContext(conf=conf or SparkConf())
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/context.py", line 201, in __init__
[cc8d72wmj6] SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/context.py", line 436, in _ensure_initialized
[cc8d72wmj6] SparkContext._gateway = gateway or launch_gateway(conf)
[cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/java_gateway.py", line 107, in launch_gateway
[cc8d72wmj6] raise PySparkRuntimeError(
[cc8d72wmj6] pyspark.errors.exceptions.base.PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.
[cc8d72wmj6] [2024-06-20 21:04:06 +0000] [10] [ERROR] Error handling request /v2/health/ready

I would appreciate any help in solving this issue.

 

 

Thanks,

8 REPLIES

NaeemS
New Contributor III

Hi @Retired_mod ,

Thanks for your response, but I have limited access to a shared cluster and cannot use it for this purpose. Can you tell me how I can prevent databricks-feature-lookup from being added to my requirements file when I log my model using FeatureEngineeringClient, since that is what is causing this issue?

Or any other workaround for solving this issue on a single-user cluster.
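One possible workaround (a sketch only, not tested against the Feature Engineering client): download the logged model's artifacts, strip the conflicting pin from requirements.txt, and log the model again. The filtering step could look like the helper below; `strip_requirement` is hypothetical, not part of any Databricks or MLflow library:

```python
def strip_requirement(requirements_text: str, package: str) -> str:
    """Return requirements.txt content with any line pinning `package`
    removed, whatever version specifier it uses (==, >=, <, ~=, or none)."""
    kept = []
    for line in requirements_text.splitlines():
        # Take everything before the first version specifier as the name.
        name = line.split(";")[0]
        for sep in ("==", ">=", "<=", "~=", ">", "<"):
            name = name.split(sep)[0]
        if name.strip() != package:
            kept.append(line)
    return "\n".join(kept)


reqs = "mlflow==2.14.1\npyspark==3.5.1\ndatabricks-feature-lookup==0.*"
print(strip_requirement(reqs, "databricks-feature-lookup"))
# mlflow==2.14.1
# pyspark==3.5.1
```

Whether the re-logged model still resolves feature lookups at serving time is a separate question; this only shows the requirements-editing step.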

 

robbe
New Contributor III

Hi @NaeemS, the issue should not arise from having databricks-feature-lookup in your serving endpoint; quite the opposite. You should have the databricks-feature-lookup dependency but not the databricks-feature-engineering dependency.

Can you please elaborate more on the error that you mentioned in the OP, perhaps including some reproducible code?

NaeemS
New Contributor III

Hi @robbe,
I am facing the following errors when creating a serving endpoint for my Spark pipeline:

[5548dsptvc] raise self._exception
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 194, in get_model_option_or_exit
[5548dsptvc] self.model = self.model_future.result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[5548dsptvc] return self.__get_result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
[5548dsptvc] raise self._exception
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 194, in get_model_option_or_exit
[5548dsptvc] self.model = self.model_future.result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[5548dsptvc] return self.__get_result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
[5548dsptvc] raise self._exception
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 194, in get_model_option_or_exit
[5548dsptvc] self.model = self.model_future.result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[5548dsptvc] return self.__get_result()

I have tested creating a sklearn model in my environment, which works fine, but for the Spark pipeline I am unable to do so. The following is the requirements file being logged with my model:

mlflow==2.14.1
numpy==1.21.5
pyspark==3.5.1
scipy==1.9.1
databricks-feature-lookup==0.*


databricks-feature-lookup 1.2.14 is being installed in my environment, as the earlier versions have been deprecated.
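As an aside, 1.2.14 cannot satisfy the `==0.*` pin in the requirements file above, so the environment build must be overriding that pin. A PEP 440 `==X.*` wildcard matches only versions whose release starts with that prefix; a deliberately simplified check (`satisfies_wildcard` is illustrative only, not a real library function):

```python
def satisfies_wildcard(version: str, spec: str) -> bool:
    """Simplified PEP 440 wildcard check for pins like '==0.*'."""
    if not (spec.startswith("==") and spec.endswith(".*")):
        raise ValueError("expected a pin of the form '==X.*'")
    prefix = spec[2:-1]            # '==0.*' -> '0.'
    return version.startswith(prefix)


print(satisfies_wildcard("0.8.0", "==0.*"))   # True
print(satisfies_wildcard("1.2.14", "==0.*"))  # False
```

For real resolution logic, `packaging.specifiers.SpecifierSet` implements the full PEP 440 rules.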

Thanks,

robbe
New Contributor III

Hi @NaeemS, it's hard to say given how uninformative the error is. I will try to give it a go next week, but maybe you can help me by answering a few questions:

  • Can you paste the exact line of code that triggers the error?
  • Does the sklearn model also use the Feature Store? Is the only difference the library used (pyspark vs sklearn)?
  • Is your Online Feature Store correctly configured with the right credentials to retrieve the latest feature set?
  • What happens if you remove databricks-feature-lookup==0.* from your requirements file?

NaeemS
New Contributor III

Hi @robbe , 

My model was logged successfully using the Feature Engineering client; the issue appears while creating a serving endpoint for that model.
For the sklearn model I used a simple random forest, while for the Spark model I am logging a pipeline with multiple steps, such as an imputer, indexer, assembler, and the model itself.
Also, I am not using third-party feature stores here. I am using online tables within Unity Catalog, a relatively new feature introduced by Databricks. I am using the same store with the sklearn model and it works fine; the issue appears only with the Spark pipeline.

While logging my model I am not specifying databricks-feature-lookup as a dependency; it is added by default with my model even if I provide my own requirements file when logging the model.

Thanks

robbe
New Contributor III

Hi @NaeemS, did you manage to find a fix? I tried to run the same setup that you have, and I am running into the same problem. I think I found the reason: JAVA_HOME is not set.

So the Feature Engineering library seems to be interfering with the Java installation. I'll try to look into the issue further.
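For anyone hitting the same [JAVA_GATEWAY_EXITED] error: PySpark's gateway launcher needs to find a Java runtime, either via JAVA_HOME or via a `java` binary on PATH. A small diagnostic sketch (the `diagnose_java` helper is my own, not part of PySpark):

```python
import os
import shutil


def diagnose_java() -> str:
    """Report how (or whether) PySpark would locate a Java runtime."""
    java_home = os.environ.get("JAVA_HOME")
    if java_home:
        return f"JAVA_HOME is set to {java_home}"
    java_bin = shutil.which("java")
    if java_bin:
        return f"JAVA_HOME unset; falling back to java on PATH: {java_bin}"
    return "no Java runtime found; expect [JAVA_GATEWAY_EXITED]"


print(diagnose_java())
```

Running this in the serving container's startup (e.g. from a custom pyfunc's load_context) would confirm whether the image the endpoint builds actually ships a JDK.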

damselfly20
New Contributor III

Hi @robbe, I'm facing the same error as @NaeemS. I've deployed an endpoint for a RAG chain in Azure Databricks, and at first it worked well. I've set scale_to_zero_enabled=True. The problem is: sometimes scaling up from zero works fine, and sometimes it results in an error:

[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
[b2rtc] raise self._exception
[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 182, in get_model_option_or_exit
[b2rtc] self.model = self.model_future.result()
[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[b2rtc] return self.__get_result()
[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
[b2rtc] raise self._exception
[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 182, in get_model_option_or_exit
[b2rtc] self.model = self.model_future.result()
[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
...
...

This goes on and on, but it's the same six lines over and over again. It's also interesting that, in spite of the exception in the logs, the serving endpoint state never changes to Error but remains Ready (Scaling from zero) instead.

My requirements are:

mlflow==2.14.1
cloudpickle==2.0.0
databricks-feature-engineering==0.2.1
databricks-sdk==0.12.0
databricks-vectorsearch==0.22
entrypoints==0.4
langchain-community==0.2.6
langchain==0.2.6
numpy==1.23.5
packaging==23.2
pandas==1.5.3
psutil==5.9.0
pydantic==1.10.6
pyyaml==6.0
requests==2.28.1
tornado==6.1 

 

robbe
New Contributor III

Hi @damselfly20, unfortunately I can't help much with that, as I've never worked with RAG. Are you sure it's the same error, though? @NaeemS's error and mine seem to be Java-related, while yours seems MLflow-related.
