06-20-2024 03:03 PM
Hi,
I am trying to deploy my model which was logged by featureStoreEngineering client as a serving endpoint in Databricks. But I am facing following error:
The Databricks Lookup client from databricks-feature-lookup and Databricks Feature Store client from databricks-feature-engineering cannot be installed in the same python environment.
When I log my model using FeatureStoreEngineeringClient databricks-feature-lookup is added by default in my requirements file, and when I load that model in other environment I get this error. Also, when I add databricks-feature-engineering ad dependency I get the same error in my model serving endpoint creation. Also, if I don't add databricks-feature-engineering as a dependency I will get other errors while creation of my serving endpoint.
model_impl = importlib.import_module(conf[MAIN])._load_pyfunc(data_path) [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflow/spark/__init__.py", line 898, in _load_pyfunc [cc8d72wmj6] spark = _create_local_spark_session_for_loading_spark_model() [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflow/utils/_spark_utils.py", line 131, in _create_local_spark_session_for_loading_spark_model [cc8d72wmj6] .getOrCreate() [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/sql/session.py", line 497, in getOrCreate [cc8d72wmj6] sc = SparkContext.getOrCreate(sparkConf) [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/context.py", line 515, in getOrCreate [cc8d72wmj6] SparkContext(conf=conf or SparkConf()) [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/context.py", line 201, in __init__ [cc8d72wmj6] SparkContext._ensure_initialized(self, gateway=gateway, conf=conf) [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/context.py", line 436, in _ensure_initialized [cc8d72wmj6] SparkContext._gateway = gateway or launch_gateway(conf) [cc8d72wmj6] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/pyspark/java_gateway.py", line 107, in launch_gateway [cc8d72wmj6] raise PySparkRuntimeError( [cc8d72wmj6] pyspark.errors.exceptions.base.PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number. [cc8d72wmj6] [2024-06-20 21:04:06 +0000] [10] [ERROR] Error handling request /v2/health/ready I would appreciate any help in solving this issue.
Thanks,
06-21-2024 09:31 AM - edited 06-21-2024 09:54 AM
Hi @Kaniz_Fatma ,
Thanks for your response. But I have limited access to shared cluster and can not use it for this purpose. Can you tell me how I can prevent databricks-feature-lookup from being added in my requirements file when I am logging my model using FeatureStoreEngineeringClient which is causing this issue.
Or any other workaround for solving this issue with single user cluster.
06-23-2024 12:22 PM
Hi @NaeemS , but the issue should not arise from having databricks-feature-lookup in your serving endpoint, quite the opposite. You should have the databricks-feature-lookup dependency but not the databricks-feature-engineering dependency.
Can you please elaborate more on the error that you mentioned in the OP, perhaps including some reproducible code?
06-24-2024 03:14 PM
Hi @robbe ,
I am facing following errors when creating serving endpoint for my spark pipeline
5548dsptvc] raise self._exception
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 194, in get_model_option_or_exit
[5548dsptvc] self.model = self.model_future.result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[5548dsptvc] return self.__get_result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
[5548dsptvc] raise self._exception
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 194, in get_model_option_or_exit
[5548dsptvc] self.model = self.model_future.result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[5548dsptvc] return self.__get_result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
[5548dsptvc] raise self._exception
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 194, in get_model_option_or_exit
[5548dsptvc] self.model = self.model_future.result()
[5548dsptvc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result
[5548dsptvc] return self.__get_result()
I have tested creating a sklearn model in my environment which is working fine, but for the spark pipeline I am unable to do so , following is my requirement file which is being logged with my model.
mlflow==2.14.1
numpy==1.21.5
pyspark==3.5.1
scipy==1.9.1
databricks-feature-lookup==0.*
databricks-feature-lookup 1.2.14 is being installed in my environment as the earlier versions has been deprecated.
Thanks,
06-25-2024 07:56 AM
Hi @NaeemS, it's hard to say given how uninformative the error is. I will try to give it a go next week but maybe you can help me by answering a few questions:
databricks-feature-lookup==0.*from your requirements file?
06-26-2024 10:15 AM
Hi @robbe ,
My model was logged successfully using feature engineering client. The issue appears while creating a serving endpoint for that model.
For the Sklearn model I used a simple Random Forest model and for the spark model I am logging a pipeline with multiple steps like imputer, indexer, assembler and model.
Also, I am not using 3rd part feature stores here. I am using online tables within Unity Catalog, this is relatively a new feature introduced by Databricks. I am using same store with sklearn model and it is working fine, the issue appears only with spark pipeline.
While logging my model I am not specifying databricks-feature-lookup as dependency, it is being added by default with my model even if I give my own requirements file while logging the model.
Thanks
07-22-2024 01:28 AM
Hi @NaeemS, did you manage to find a fix? I tried to run the same setting that you have and Iam running into your same problem and I think I got the reason - JAVA_HOME is not set.
So the Feature Engineering library seems to be messing up with the Java installation. I'll try to look into the issue further.
07-23-2024 11:32 PM
Hi @robbe, I'm facing the same error like @NaeemS. I've deployed an endpoint for a RAG chain in Azure Databricks and at first, it worked well. I've set scale_to_zero_enabled=True. The problem is: Sometimes, scaling up from zero works fine and sometimes it results in an error:
[b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result [b2rtc] raise self._exception [b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 182, in get_model_option_or_exit [b2rtc] self.model = self.model_future.result() [b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result [b2rtc] return self.__get_result() [b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result [b2rtc] raise self._exception [b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/site-packages/mlflowserving/scoring_server/__init__.py", line 182, in get_model_option_or_exit [b2rtc] self.model = self.model_future.result() [b2rtc] File "/opt/conda/envs/mlflow-env/lib/python3.10/concurrent/futures/_base.py", line 451, in result ... ...
This goes on and on, but it's the same six lines over and over again. It's also interesting that in spite of the exception in the logs, the serving endpoint state never changes to Error, but remains Ready (Scaling from zero) instead.
My requirements are:
mlflow==2.14.1 cloudpickle==2.0.0 databricks-feature-engineering==0.2.1 databricks-sdk==0.12.0 databricks-vectorsearch==0.22 entrypoints==0.4 langchain-community==0.2.6 langchain==0.2.6 numpy==1.23.5 packaging==23.2 pandas==1.5.3 psutil==5.9.0 pydantic==1.10.6 pyyaml==6.0 requests==2.28.1 tornado==6.1
07-30-2024 08:08 AM
Hi @damselfly20 unfortunately I can't help much with that as I've never worked with RAGs. Are you sure it's the same error though? @NaeemS's and my errors seems to be Java related and yours MLflow related.
Join a Regional User Group to connect with local Databricks users. Events will be happening in your city, and you won’t want to miss the chance to attend and share knowledge.
If there isn’t a group near you, start one and help create a community that brings people together.
Request a New Group