Community Platform Discussions
Connect with fellow community members to discuss general topics related to the Databricks platform, industry trends, and best practices. Share experiences, ask questions, and foster collaboration within the community.

GeoPandas Install

tomos_phillips1
New Contributor II

Hi,

I cannot install geopandas in my notebook. I've tried the usual generic fixes (different forms of pip install, etc.) but always get this error:

CalledProcessError: Command 'pip --disable-pip-version-check install geopandas' returned non-zero exit status 1.
---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
File <command-3215529319294224>, line 3
      1 get_ipython().run_line_magic('pip', 'install folium')
      2 get_ipython().run_line_magic('pip', 'install shapely')
----> 3 get_ipython().run_line_magic('pip', 'install geopandas')
      4 get_ipython().run_line_magic('pip', 'install geopy')
      5 get_ipython().run_line_magic('pip', 'install rtree')

File /databricks/python/lib/python3.10/site-packages/IPython/core/interactiveshell.py:2369, in InteractiveShell.run_line_magic(self, magic_name, line, _stack_depth)
   2367     kwargs['local_ns'] = self.get_local_scope(stack_depth)
   2368 with self.builtin_trap:
-> 2369     result = fn(*args, **kwargs)
   2371 # The code below prevents the output from being displayed
   2372 # when using magics with decodator @output_can_be_silenced
   2373 # when the last Python token in the expression is a ';'.
   2374 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):

File /databricks/python_shell/dbruntime/PipMagicOverrides.py:34, in PipMagicOverrides.pip(self, line)
     32 @line_magic
     33 def pip(self, line):
---> 34     self.pipMagicHandler.runCmd("pip", line)

File /databricks/python_shell/dbruntime/PipMagicOverrides.py:60, in PipMagicHandler.runCmd(self, magicCmd, line)
     58     print(PYTHON_RESTART_WARNING)
     59 if parsedResult.rewrittenCommand():
---> 60     self.executePipCommand(parsedResult)
     61     envManager.postExecute(parsedResult)
     62 if parsedResult.isMutation():
     63     # double print this output is at the end so it is more
     64     # likely to be seen

File /databricks/python_shell/dbruntime/PipMagicOverrides.py:123, in PipMagicHandler.executePipCommand(self, result)
    121     sys.stdout.flush()
    122 if returncode != 0:
--> 123     raise subprocess.CalledProcessError(returncode, origCmd)
    124 finally:
    125     end = time.time()

CalledProcessError: Command 'pip --disable-pip-version-check install geopandas' returned non-zero exit status 1.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Can anyone help me with this issue?
 
Thanks. 
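
The pip output quoted above ends with "See above for output", so the real cause of the failed wheel build sits earlier in the log than the part that was pasted. A minimal sketch for capturing the whole log from a notebook cell, assuming the notebook's Python can launch pip as a subprocess; `run_and_capture` and `pip_install_command` are hypothetical helper names, not part of Databricks or pip:

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command and return (exit_code, combined stdout and stderr text)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so line order is kept
        text=True,
    )
    return proc.returncode, proc.stdout


def pip_install_command(package):
    """Build a pip command that uses the same interpreter as the notebook."""
    return [sys.executable, "-m", "pip", "install", package]
```

Calling `code, log = run_and_capture(pip_install_command("geopandas"))` and then `print(log)` should show the lines above "Getting requirements to build wheel did not run successfully", which is where the build step reports what it was missing.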
10 REPLIES

shan_chandra
Esteemed Contributor

@tomos_phillips1 - can you please try installing geopandas on a single-node (driver-only) cluster?

%pip install geopandas

 

@shan_chandra - I do not have permission in my org to modify any cluster properties. Do you know of another way around this? Our internal IT team does not have a fix for this either.

@tomos_phillips1 - can you please raise a support ticket with the Databricks support team to triage this further?

vbvasa
New Contributor II

@shan_chandra - It doesn't work for me on a driver-only cluster either.

[Screenshot attached: vbvasa_0-1713870495139.png]

 

vbvasa
New Contributor II

@tomos_phillips1 @shan_chandra 
I got the init script below from Databricks Support. It worked for us in a Databricks AWS environment.

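dbutils.fs.put("/databricks/scripts/libinstall.sh", """#!/bin/bash
# Clear stale apt package lists, refresh them, then install the native libraries.
sudo rm -r /var/lib/apt/lists/*
sudo apt clean &&
sudo apt update --fix-missing -y &&
sudo apt install -y libmysqlclient21
# GDAL development files (headers and gdal-config)
sudo apt install -y libgdal-dev
""", True)  # True = overwrite an existing file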

Thanks so much for this, it worked for me!
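
One way to confirm that the init script above took effect, once the cluster has restarted with it attached, is to check for `gdal-config` from a notebook cell before retrying the install. This is only a sketch: it assumes `libgdal-dev` places `gdal-config` on the PATH (as it does on the Ubuntu images I am aware of), and `gdal_version` is a hypothetical helper name.

```python
import shutil
import subprocess


def gdal_version(executable="gdal-config"):
    """Return the version printed by gdal-config, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None  # GDAL development files are probably not installed
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

If `gdal_version()` returns `None`, the script most likely did not run on that cluster, and `%pip install geopandas` can be expected to fail the same way as before.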

Kaniz_Fatma
Community Manager

Hi @vbvasa

  • The error message indicates that a GDAL API version must be specified. You can address this by providing a path to gdal-config using a GDAL_CONFIG environment variable or ...1.
  • To set the GDAL_CONFIG environment variable, you can follow these steps:
    • Locate the path to your GDAL installation (usually in the bin directory).
    • Set the GDAL_CONFIG environment variable to point to the gdal-config executable. For example:
      export GDAL_CONFIG=/path/to/gdal-config
      
  • Alternatively, you can set the GDAL_VERSION environment variable to the desired GDAL version.
  • Consider using Conda instead of pip to install GeoPandas. Conda handles package dependencies more effectively and can help avoid version conflicts.
  • Create a new environment and install GeoPandas within it:
    conda create -n geo-env -c conda-forge geopandas
    
  • If this fails, try updating Conda and then retry the installation.
  • Sometimes, manually installing dependencies can resolve issues.
  • Follow these steps:
    • Install the required dependencies one by one:
      pip install cython
      pip install shapely
      pip install fiona
      pip install pyproj
      pip install rtree
      
    • After installing the dependencies, try installing GeoPandas again:
      pip install geopandas
  • Remember to activate the appropriate environment (if using Conda) before attempting the installation.

    Hopefully, one of these approaches will help you successfully install GeoPandas! 😊🌐
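
For the GDAL_CONFIG suggestion above, a minimal sketch of setting the variable from Python rather than with `export`, so it is passed explicitly to the pip subprocess. The helper name `env_with_gdal_config` is hypothetical, and the path `/usr/bin/gdal-config` in the usage note is an assumption; use whatever path your GDAL installation actually provides.

```python
import os


def env_with_gdal_config(gdal_config_path, base_env=None):
    """Return a copy of the environment with GDAL_CONFIG set to the given path."""
    # Copy so the caller's mapping (or os.environ) is left unchanged.
    env = dict(os.environ if base_env is None else base_env)
    env["GDAL_CONFIG"] = gdal_config_path
    return env
```

The result can be handed to `subprocess.run([sys.executable, "-m", "pip", "install", "geopandas"], env=env_with_gdal_config("/usr/bin/gdal-config"))`. Passing the environment explicitly avoids relying on whether `%pip` inherits variables set earlier in the notebook, which I have not verified.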

I have been having this issue too. Conda may well be better, but how do you use conda in Databricks? From what I can see, the only ways are to either use runtime 7.3 or below or use a completely new type of container. Those are not feasible.

shan_chandra
Esteemed Contributor

@brian999 - Conda is subject to commercial licensing. See: https://docs.databricks.com/en/archive/legacy/conda.html

As I said in my comment, and as it says on the page you just sent:

 

Important

%conda commands are deprecated, and are supported only for Databricks Runtime 7.3 LTS ML. Databricks recommends using %pip for managing notebook-scoped libraries. If you require Python libraries that can only be installed using conda, you can use conda-based docker containers to pre-install the libraries you need.

This is not at all a feasible way to use conda.

 
