
hdbscan package error

Itachi_Naruto
New Contributor II

I am trying to import **hdbscan**, but it throws the following error:

/databricks/python_shell/dbruntime/PythonPackageImportsInstrumentation/__init__.py in import_patch(name, globals, locals, fromlist, level)
    156             # Import the desired module. If you're seeing this while debugging a failed import,
    157             # look at preceding stack frames for relevant error information.
--> 158             original_result = python_builtin_import(name, globals, locals, fromlist, level)
    159 
    160             is_root_import = thread_local._nest_level == 1
 
hdbscan/_hdbscan_linkage.pyx in init hdbscan._hdbscan_linkage()
 
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject

When I update numpy to 1.22.0, this error goes away, but then importing **umap** fails with a message that the numpy version must be <1.20.

To summarize

Databricks runtime version - 10.1 ML (includes Apache Spark 3.2.0, Scala 2.12)

Python Version - 3.8.10

Python packages installed

umap-learn==0.5.1

numpy==1.22.0

hdbscan==0.8.27

(This combination throws an error while importing umap)

Python packages installed

umap-learn==0.5.1

numpy==1.20.0

hdbscan==0.8.27

(This combination throws the hdbscan import error shown above)
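When comparing combinations like the two above, it can help to record the versions that are actually installed in the notebook environment rather than the ones that were requested. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper name `installed_versions` is hypothetical and not part of any of these libraries. It reads package metadata only, so it still works when importing hdbscan itself fails.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not present in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names taken from the package lists in the question.
    report = installed_versions(["numpy", "hdbscan", "umap-learn"])
    for name, version in report.items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting this output into a bug report makes it easier for others to reproduce the exact environment.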

Accepted Solutions

Kaniz
Community Manager

Hi @Rajamannar Aanjaram, it looks like there's a compatibility issue with the hdbscan library.

You may want to check the GitHub issue that addresses a similar problem.

If that GitHub issue doesn't solve your problem, I would suggest opening a new issue here: https://github.com/scikit-learn-contrib/hdbscan/issues


I hope this helps. Please let us know if you have any further questions.


7 REPLIES

Kaniz
Community Manager

Hi @Itachi_Naruto! My name is Kaniz, and I'm the technical moderator here. Great to meet you, and thanks for your question! Let's see if your peers in the community have an answer to your question first. Otherwise, I will get back to you soon. Thanks.

Kaniz
Community Manager

Hi @Rajamannar Aanjaram, please try these commands. You should then be able to install the hdbscan package.

pip install --upgrade numpy
pip install hdbscan
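After reinstalling, one way to confirm whether the import now succeeds is to attempt it and capture the failure instead of letting it stop the notebook. This is a minimal sketch; `try_import` is a hypothetical helper name, and it assumes the check runs in a fresh Python process, since a numpy that was already imported typically stays loaded until the interpreter restarts.

```python
import importlib


def try_import(module_name):
    """Attempt an import and return (succeeded, message)."""
    try:
        importlib.import_module(module_name)
    except (ImportError, ValueError) as exc:
        # The binary-incompatibility failure in this thread surfaces as a
        # ValueError rather than an ImportError, so both are caught here.
        return False, f"{type(exc).__name__}: {exc}"
    return True, "ok"


if __name__ == "__main__":
    for name in ("numpy", "hdbscan", "umap"):
        succeeded, message = try_import(name)
        print(f"{name}: {message}")
```

If hdbscan still reports the `numpy.ndarray size changed` message here, the compiled hdbscan extension and the installed numpy still disagree.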

Itachi_Naruto
New Contributor II

Hi @Kaniz Fatma, thanks for the answer, but after running the above commands I can't import the umap library.

Hi @Rajamannar Aanjaram,

To use UMAP, you need to install umap-learn, not umap.

So, if you've installed umap, please run the following commands to uninstall umap and install umap-learn instead:

pip uninstall umap
pip install umap-learn

And then in your python code make sure you are importing the module using:

import umap.umap_ as umap

Instead of

import umap
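To tell which of the two packages the name `umap` currently resolves to, a quick check is whether the imported module exposes a top-level `UMAP` class, which umap-learn provides. A minimal sketch under that assumption; `exposes_umap_class` and `check_umap` are hypothetical helper names, not part of either package.

```python
import importlib


def exposes_umap_class(module):
    """Return True if the given module object has a top-level `UMAP` attribute."""
    return hasattr(module, "UMAP")


def check_umap():
    """Import `umap` and describe what was found, without raising."""
    try:
        module = importlib.import_module("umap")
    except (ImportError, ValueError) as exc:
        return f"import failed: {type(exc).__name__}: {exc}"
    if exposes_umap_class(module):
        return "umap-learn appears to be installed"
    return "a module named umap was imported, but it has no UMAP class"


if __name__ == "__main__":
    print(check_umap())
```

If the last message appears, the unrelated `umap` distribution is probably shadowing umap-learn, and the uninstall and install commands above apply.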

Atanu
Esteemed Contributor

Does this help, @Rajamannar Aanjaram?

Itachi_Naruto
New Contributor II

Hi @Atanu Sarkar, no, this solution didn't work.

