04-14-2022 02:24 AM
Hi!
When I run a notebook on Databricks, it throws the error "'JavaPackage' object is not callable", which points to the pydeequ library:
/local_disk0/.ephemeral_nfs/envs/pythonEnv-3abbb1aa-ee5b-48da-aaf2-18f273299f52/lib/python3.8/site-packages/pydeequ/checks.py in __init__(self, spark_session, level, description, constraints)
91 self._jvm = spark_session._jvm
92 self.level = level
---> 93 self._java_level = self.level._get_java_object(self._jvm)
94 self._check_java_class = self._jvm.com.amazon.deequ.checks.Check
95 self.description = description
/local_disk0/.ephemeral_nfs/envs/pythonEnv-3abbb1aa-ee5b-48da-aaf2-18f273299f52/lib/python3.8/site-packages/pydeequ/checks.py in _get_java_object(self, jvm)
19 return jvm.com.amazon.deequ.checks.CheckLevel.Error()
20 if self == CheckLevel.Warning:
---> 21 return jvm.com.amazon.deequ.checks.CheckLevel.Warning()
22 raise ValueError("Invalid value for CheckLevel Enum")
Spark 3.2.0,
Scala 2.12
I believe it has something to do with my runtime version, but I don't want to downgrade it.
Please help me with this.
Thanks
Accepted Solutions
04-14-2022 07:03 AM
@Direo Direo, see https://github.com/awslabs/python-deequ/issues/1. You can try installing matching deequ and pydeequ versions, as others in that thread have done.
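For context, "'JavaPackage' object is not callable" usually means the deequ JAR is missing from the JVM, or was built for a different Spark version, so the Python wrapper has nothing to call into. On a self-managed Spark session you can pin the matching JAR when the session is created; here is a minimal sketch along the lines of the PyDeequ README, assuming Spark 3.2 (on Databricks the session already exists, so attach the JAR as a cluster library instead, as the answer below describes):

```python
import os

# Tell pydeequ which deequ build to resolve; "3.2" is an assumption here.
os.environ["SPARK_VERSION"] = "3.2"

import pydeequ
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    # Pull the deequ JAR matching SPARK_VERSION onto the classpath.
    .config("spark.jars.packages", pydeequ.deequ_maven_coord)
    # Exclude a transitive artifact that pydeequ's README recommends excluding.
    .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
    .getOrCreate()
)
```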
11-17-2023 07:39 AM
Hi. If you are struggling like I was, these were the steps I followed to make it work:
1 - Created a cluster with Runtime 10.4 LTS, which ships Spark 3.2.1 (it should work with more recent runtimes, but be aware of the Spark version)
2 - When creating the cluster, added the following libraries through the UI:
- Maven library source. Coordinates: com.amazon.deequ:deequ:2.0.1-spark-3.2
- PyPI library source. Package: pydeequ==1.1.1
3 - After the cluster is created, in your notebook set the Spark version environment variable as described in PyDeequ's documentation (os.environ["SPARK_VERSION"] = "3.2")
Then just import pydeequ and you're ready to go; a minimal end-to-end example is sketched below.
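For reference, here is a sketch of what a first notebook cell could look like after those steps; the toy DataFrame and constraint choices are illustrative, not from the original post:

```python
import os

# Must be set before importing pydeequ (per its documentation).
os.environ["SPARK_VERSION"] = "3.2"

import pydeequ
from pydeequ.checks import Check, CheckLevel
from pydeequ.verification import VerificationResult, VerificationSuite
from pyspark.sql import SparkSession

# On Databricks a `spark` session already exists; getOrCreate() reuses it.
spark = SparkSession.builder.getOrCreate()

# Toy data purely for illustration.
df = spark.createDataFrame([(1, "a"), (2, "b"), (3, None)], ["id", "name"])

check = Check(spark, CheckLevel.Error, "smoke test")
result = (
    VerificationSuite(spark)
    .onData(df)
    .addCheck(check.isComplete("id").isUnique("id"))
    .run()
)

# Tabular view of each constraint's status.
VerificationResult.checkResultsAsDataFrame(spark, result).show()
```

If this cell runs without the 'JavaPackage' error, the deequ JAR and pydeequ package versions are wired up correctly.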