by
Direo
• Contributor II
- 8118 Views
- 2 replies
- 3 kudos
Hi!When I run a notebook on databricks, it throws error - " 'JavaPackage' object is not callable" which points to pydeequ library:/local_disk0/.ephemeral_nfs/envs/pythonEnv-3abbb1aa-ee5b-48da-aaf2-18f273299f52/lib/python3.8/site-packages/pydeequ/che...
- 8118 Views
- 2 replies
- 3 kudos
Latest Reply
Hi. If you are struggling like I was, these were the steps I followed to make it work:1 - Created a cluster with Runtime 10.4 LTS, which has spark version 3.2.1 (it should work with more recent runtimes, but be aware of the spark version)2 - When cre...
1 More Replies
- 2534 Views
- 3 replies
- 2 kudos
I'm using PyDeequ data quality checks in one of our jobs. After adding this check, I noticed that the job does not complete and keeps running indefinitely after PyDeequ checks are completed and results are returned.As stated in Pydeequ documentation ...
- 2534 Views
- 3 replies
- 2 kudos
Latest Reply
Hm, deequ certainly works as I have read about multiple people using it.And when reading the issues (open/closed) on the github pages of pydeequ, databricks is mentioned in some issues so it might be possible after all.But I think you need to check y...
2 More Replies
- 7955 Views
- 4 replies
- 2 kudos
Hi everyone,I want to do some tests regarding data quality and for that I pretend to use PyDeequ on a databricks notebook. Keep in mind that I'm very new to databricks and Spark.First I created a cluster with the Runtime version "10.4 LTS (includes A...
- 7955 Views
- 4 replies
- 2 kudos
Latest Reply
I assumed I wouldn't need to add the Deequ library. Apparently, all I had to do was add it via Maven coordinates and it solved the problem.
3 More Replies