06-02-2022 01:49 AM
06-02-2022 03:07 AM
@Chandan Angadi, which library are you using? Are you installing it from the Libraries tab, or with an init script? Does this happen every time or only intermittently?
06-02-2022 04:24 AM
Hi @Prabakar Ammeappin,
I am using a set of libraries that I install from within a notebook, not in the cluster configuration.
python>=3.7,<3.8
findspark=1.3.0
openpyxl=3.0.7
pyarrow=0.14.0
smart_open=5.2.1
xlrd=2.0.1
conda-pack=0.6.0
tqdm=4.62.2
tsfresh=0.17.0
scikit-learn=0.24.2
pip=21.2.4
git=2.33.0
pandas=1.2.5
fsspec=2021.7.0
mlflow==1.20.1
pydeequ==0.1.5
s3fs==2021.8.0
botocore==1.20.106
boto3==1.17.106
pyspark==2.4.7
torch-tb-profiler==0.4.0
pytorch-forecasting==0.9.2
pytorch-lightning==1.4.5
tensorboard==2.8.0
ipyaggrid==0.2.1
jupyter-contrib-nbextensions==0.5.1
fastparquet==0.7.1
plotnine==0.8.0
tslearn==0.5.2
fastdtw==0.3.4
dtaidistance==2.3.2
catboost==1.0.3
shap==0.40.0
k-means-constrained==0.6.0
pdf2image==1.16.0
kaleido==0.2.1
pickle5==0.0.12
Thanks,
Chandan
06-02-2022 08:24 AM
By any chance, was the cluster restarted after installing the libraries or was it detached and reattached from/to the notebook? Notebook-scoped libraries do not persist across sessions. You must reinstall notebook-scoped libraries at the beginning of each session, or whenever the notebook is detached from a cluster.
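One way to confirm at the start of a session whether the pinned libraries are actually present is sketched below. This is only an illustration: `missing_pins` is a hypothetical helper name, and the two pins are taken from the list posted above. It uses the standard-library `importlib.metadata` and does not install anything.

```python
from importlib import metadata


def missing_pins(pins):
    """Return {package: installed_version_or_None} for every pin that is not satisfied.

    `pins` maps a package name to the exact version expected in this session.
    A value of None in the result means the package is not installed at all.
    """
    problems = {}
    for name, wanted in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = None  # not installed in the current environment
            continue
        if found != wanted:
            problems[name] = found  # installed, but a different version
    return problems


if __name__ == "__main__":
    # Pins copied from the package list in this thread.
    print(missing_pins({"pydeequ": "0.1.5", "mlflow": "1.20.1"}))
```

An empty result means every pin is satisfied; anything else would need to be reinstalled in the first cell before the rest of the notebook runs.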
06-02-2022 09:52 PM
Hi @Prabakar Ammeappin,
Yes, I agree that notebook-scoped libraries do not persist across sessions. I install the required libraries in the first cell of the notebook, and the rest of the code runs afterward. The notebook is not detached from the cluster.
06-03-2022 01:44 AM
We need to look at the logs to understand this further. Could you export and share a notebook with minimal repro steps? That would help us reproduce the problem in-house and check the logs. Alternatively, if you have a support contract, it would be better to raise a support ticket.
06-09-2022 04:13 AM
Hi @Kaniz Fatma,
Sorry for the late response. Installing deequ-2.0.1-spark-3.2 on the cluster solved the issue.
06-09-2022 04:22 AM
The Spark version that the JAR file was built for must match the cluster's Spark version.
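As a rough illustration of that rule, the sketch below compares the `-spark-X.Y` suffix in a JAR name such as `deequ-2.0.1-spark-3.2` with a Spark version string. The helper names `spark_minor` and `jar_matches_spark` are hypothetical, and the check assumes the JAR follows that naming pattern. On a cluster, the version string would typically come from `spark.version`.

```python
import re


def spark_minor(version):
    """Return the 'major.minor' prefix of a Spark version string, e.g. '3.2.1' -> '3.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized Spark version: {version!r}")
    return f"{match.group(1)}.{match.group(2)}"


def jar_matches_spark(jar_name, spark_version):
    """True if the '-spark-X.Y' suffix in jar_name equals the major.minor of spark_version."""
    match = re.search(r"-spark-(\d+\.\d+)", jar_name)
    if match is None:
        raise ValueError(f"no '-spark-X.Y' suffix found in {jar_name!r}")
    return match.group(1) == spark_minor(spark_version)


if __name__ == "__main__":
    # The JAR named in this thread, checked against two example version strings.
    print(jar_matches_spark("deequ-2.0.1-spark-3.2", "3.2.1"))
    print(jar_matches_spark("deequ-2.0.1-spark-3.2", "2.4.7"))
```

The second call uses 2.4.7 only because that is the `pyspark` version pinned in the package list above; what matters for the JAR is the Spark version the cluster itself runs.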