hdbscan package error

Itachi_Naruto
New Contributor II

I am trying to import **hdbscan**, but it throws the following error:

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    156             # Import the desired module. If you’re seeing this while debugging a failed import,
    157             # look at preceding stack frames for relevant error information.
--> 158             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    159 
    160             is_root_import = thread_local._nest_level == 1
 
hdbscan/_hdbscan_linkage.pyx in init hdbscan._hdbscan_linkage()
 
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject

When I update numpy to 1.22.0, this error goes away, but importing **umap** then fails with a message that the numpy version must be <1.20.
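One way to see which installed package declares the numpy bound is to read the requirement metadata of each distribution. The following is only a minimal sketch, assuming Python 3.8+ (where `importlib.metadata` is in the standard library). `numpy_requirements` and `mentions_numpy` are hypothetical helper names, and `numba` is listed on the assumption that it is the umap-learn dependency imposing the limit; that is not confirmed in this thread.

```python
import re
from importlib import metadata

# Matches requirement strings whose project name is exactly "numpy"
# (so "numpydoc" or "numpy-quaternion" are not counted).
_NUMPY_REQ = re.compile(r"numpy(?![\w.\-])", re.IGNORECASE)


def mentions_numpy(requirement):
    """Return True if a requirement string such as 'numpy>=1.17' targets numpy."""
    return _NUMPY_REQ.match(requirement.strip()) is not None


def numpy_requirements(packages):
    """Map each package name to its declared numpy requirements.

    The value is None when the package is not installed, and an empty
    list when it is installed but declares no numpy requirement.
    """
    found = {}
    for name in packages:
        try:
            declared = metadata.requires(name) or []
        except metadata.PackageNotFoundError:
            found[name] = None
            continue
        found[name] = [req for req in declared if mentions_numpy(req)]
    return found


if __name__ == "__main__":
    # "numba" is an assumption; the other names come from the post.
    for pkg, reqs in numpy_requirements(["umap-learn", "hdbscan", "numba"]).items():
        print(pkg, "->", reqs)
```

Running this in the same notebook session that fails should show whether any of the listed packages pins numpy below a given version.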

To summarize

Databricks runtime version - 10.1 ML (includes Apache Spark 3.2.0, Scala 2.12)

Python Version - 3.8.10

Python packages installed

umap-learn==0.5.1

numpy==1.22.0

hdbscan==0.8.27

(This combination throws an error while importing umap)

Python packages installed

umap-learn==0.5.1

numpy==1.20.0

hdbscan==0.8.27

(This combination throws the hdbscan error shown above)
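Since this kind of `ValueError` typically appears when a compiled extension was built against a different numpy than the one actually imported, it may help to compare the versions pip has recorded with the numpy that the interpreter really loads. This is a hedged diagnostic sketch, not a fix. `installed_versions` and `loaded_numpy` are hypothetical helper names, and `numba` is included only as an assumed dependency of umap-learn.

```python
from importlib import metadata


def installed_versions(names):
    """Return {package name: installed version, or None if not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def loaded_numpy():
    """Return (version, file path) of the numpy that actually gets imported,
    or None if numpy cannot be imported in this environment."""
    try:
        import numpy
    except ImportError:
        return None
    return numpy.__version__, numpy.__file__


if __name__ == "__main__":
    # Package names from the lists above; "numba" is an assumption.
    for pkg, ver in installed_versions(["numpy", "umap-learn", "hdbscan", "numba"]).items():
        print(f"{pkg}: {ver}")
    print("numpy actually loaded:", loaded_numpy())
```

If the loaded numpy version or path differs from what pip reports, the session may still be using the numpy that ships with the runtime rather than the one installed afterwards.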

Itachi_Naruto
New Contributor II

Hi @Kaniz Fatma, thanks for the answer, but when we run the above commands I still can't import the umap library.

Atanu
Databricks Employee

Does this help, @Rajamannar Aanjaram?

Itachi_Naruto
New Contributor II

Hi @Atanu Sarkar, no, this solution didn't work.