Data Engineering

Unable to follow H3 Quickstart

Baldur
New Contributor II

Hello,
I'm following the H3 quickstart (Databricks SQL) tutorial because I want to run point-in-polygon queries on 21k polygons and 95B points. The volume is pushing me towards H3. The tutorial uses geopandas.

According to H3 geospatial functions, we need a Photon-enabled cluster.

ML clusters cannot have Photon enabled, and so far I have not been able to install geopandas on our regular cluster the way the quickstart does. I keep getting this error:


%pip install geopandas fsspec --quiet

Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [3 lines of output]
      <string>:86: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
      WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
      CRITICAL:root:A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
CalledProcessError: Command 'pip --disable-pip-version-check install geopandas fsspec --quiet' returned non-zero exit status 1.


Can I not use H3 and geopandas together? The choice seems to be either an ML cluster without Photon, or a regular cluster with Photon on which I cannot install geopandas as the quickstart guide does.

Below is the config for the cluster that fails to install geopandas:

[Screenshot of the cluster configuration: Baldur_0-1698768174045.png]

 

3 REPLIES

mjohns
New Contributor II

Hi @Baldur

There is no inherent limitation on installing Python libraries on a Photon cluster. I see that you have selected DBR 13.3. Can you share a little more about the cluster configuration, namely whether it is single user or shared access? And is your workspace configured to use Unity Catalog (might become relevant, not sure)? I just ran:

%pip install geopandas fsspec
with no issue on a single user (vs. shared access) Photon DBR cluster in a workspace with Unity Catalog. Can you try again with a single user + Photon DBR cluster? If that is what you have already tried, then I think you would want to dig further into any restrictions your company is placing on the environment (beyond Databricks defaults).


 

Baldur
New Contributor II

Hi @mjohns thanks for the reply.

To your questions:

  1. The cluster that fails is shared access
  2. Our workspace does have Unity Catalog

Good idea. I tried again with a Single user 13.3 LTS cluster with Photon Acceleration enabled, and I was able to install geopandas. This does sound like a config issue on that particular cluster. I'm gonna pursue that.
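
For anyone chasing a similar config difference, one possible starting point is to compare the build prerequisites on the failing and working clusters. The sketch below is only that, not a confirmed diagnosis: it checks the three things the error output names (the `gdal-config` executable, `GDAL_CONFIG`, and `GDAL_VERSION`) and reports the Python version. The function name `gdal_build_prerequisites` is made up for this example.

```python
import os
import shutil
import sys


def gdal_build_prerequisites(env=None):
    """Report the settings named in the 'GDAL API version must be specified' error.

    `env` defaults to the current process environment; passing a dict makes
    the check easy to try against a hypothetical environment.
    """
    env = os.environ if env is None else env
    return {
        # Full path to gdal-config if it is found on PATH, otherwise None.
        "gdal_config_on_path": shutil.which("gdal-config", path=env.get("PATH", "")),
        # The two environment variables the error message says can be set instead.
        "GDAL_CONFIG": env.get("GDAL_CONFIG"),
        "GDAL_VERSION": env.get("GDAL_VERSION"),
        # Major.minor version of the interpreter running this check.
        "python_version": "%d.%d" % sys.version_info[:2],
    }


if __name__ == "__main__":
    for name, value in gdal_build_prerequisites().items():
        print(f"{name}: {value}")
```

Running this in a notebook cell on each cluster and comparing the output shows whether the two environments differ in any of these settings. If all three GDAL entries are `None` on both clusters, the difference probably lies elsewhere, for example in how pip resolves packages on each cluster.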

Thank you for the guidance!

siddhathPanchal
New Contributor III

Hi @Baldur. I hope the above answer solved your problem. If you have any follow-up questions, please let us know. If you like the solution, please do not forget to press the 'Accept as Solution' button.
