Hello,
I'm following the H3 quickstart (Databricks SQL) tutorial because I want to run point-in-polygon queries on 21k polygons and 95B points. That volume is pushing me towards H3. The tutorial uses geopandas.
According to the H3 geospatial functions documentation, a Photon-enabled cluster is required.
ML clusters cannot have Photon enabled, but so far I have not been able to install geopandas on our regular cluster the way the quickstart does. I keep getting this error:
%pip install geopandas fsspec --quiet
Note: you may need to restart the kernel using dbutils.library.restartPython() to use updated packages.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [3 lines of output]
<string>:86: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
WARNING:root:Failed to get options via gdal-config: [Errno 2] No such file or directory: 'gdal-config'
CRITICAL:root:A GDAL API version must be specified. Provide a path to gdal-config using a GDAL_CONFIG environment variable or use a GDAL_VERSION environment variable.
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
CalledProcessError: Command 'pip --disable-pip-version-check install geopandas fsspec --quiet' returned non-zero exit status 1.
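The log above names three things the build looks for: a gdal-config executable on the PATH, a GDAL_CONFIG environment variable, and a GDAL_VERSION environment variable. A minimal sketch for checking them, assuming a Python notebook cell attached to the same cluster (the helper name gdal_build_hints is hypothetical and only for illustration):

```python
import os
import shutil


def gdal_build_hints():
    """Collect the three settings mentioned in the pip error message.

    Each value is None when the corresponding setting is missing.
    """
    return {
        # Full path to gdal-config if it is on PATH, otherwise None
        "gdal_config_on_path": shutil.which("gdal-config"),
        # Environment variables the error message suggests as alternatives
        "GDAL_CONFIG": os.environ.get("GDAL_CONFIG"),
        "GDAL_VERSION": os.environ.get("GDAL_VERSION"),
    }


hints = gdal_build_hints()
for name, value in hints.items():
    print(f"{name}: {value}")
```

If all three come back as None, that would be consistent with the "No such file or directory: 'gdal-config'" line in the log, though it does not by itself say how GDAL should be provided on this cluster.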
Can I not use H3 and geopandas together? As far as I can tell, the choice is between an ML cluster without Photon and a regular cluster with Photon on which I cannot install the geopandas package that the quickstart uses.
Below is the config for the cluster that fails to install geopandas: