04-24-2020 04:44 PM
Hello,
Scenario:
Trying to install some python modules into a notebook (scoped to just the notebook) using...
```
dbutils.library.installPyPI("azure-identity")
dbutils.library.installPyPI("azure-storage-blob")
dbutils.library.restartPython()
```
...getting the (unclear) error...
```
org.apache.spark.SparkException: Process List(/local_disk0/pythonVirtualEnvDirs/virtualEnv-34b93f38-5a4f-41eb-a754-f16697cd339c/bin/python, /local_disk0/pythonVirtualEnvDirs/virtualEnv-34b93f38-5a4f-41eb-a754-f16697cd339c/bin/pip, install, azure-storage-blob==12.0.0, --disable-pip-version-check) exited with code 1. Traceback (most recent call last):

---------------------------------------------------------------------------
Py4JJavaError                             Traceback (most recent call last)
<command-3781868905499817> in <module>()
      1 dbutils.library.installPyPI("azure-identity")
----> 2 dbutils.library.installPyPI("azure-storage-blob", version="12.0.0")
      3 dbutils.library.restartPython()

/local_disk0/tmp/1587770610080-0/dbutils.py in installPyPI(self, project, version, repo, extras)
    237   def installPyPI(self, project, version = "", repo = "", extras = ""):
    238     return self.print_and_return(self.entry_point.getSharedDriverContext() \
--> 239       .addIsolatedPyPILibrary(project, version, repo, extras))
    240
    241   def restartPython(self):

/databricks/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1255         answer = self.gateway_client.send_command(command)
   1256         return_value = get_return_value(
-> 1257             answer, self.gateway_client, self.target_id, self.name)
   1258
   1259         for temp_arg in temp_args:
```
Whereas `!pip install -U azure-storage-blob` seems to work just fine.

Questions:
1. Why is this?
2. At what scope does `!pip install` install python modules?
   - Notebook scope
   - Library
   - Cluster
Thank you!
04-27-2020 08:13 PM
Hi @ericOnline
I also faced the same issue, and I eventually found that upgrading the Databricks runtime version from my current "5.5 LTS (includes Apache Spark 2.4.3, Scala 2.11)" to "6.5 (includes Apache Spark 2.4.5, Scala 2.11)" resolved it.
The official documentation says that dbutils.library.installPyPI is supported on runtime version 5.1 and above, but that did not seem to be the case here.
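Since the fix hinged on the runtime version, here is a minimal sketch of one way to confirm which runtime a notebook is attached to. It assumes you run it inside a Databricks notebook (where `spark` is predefined) and that the cluster populates the standard usage-tag config key, which it normally does:
```
# Minimal sketch: print the Databricks Runtime version this notebook is attached to.
# Assumes the standard cluster usage tag is populated (normally true on Databricks clusters).
print(spark.conf.get("spark.databricks.clusterUsageTags.sparkVersion"))
# Example output: "6.5.x-scala2.11" on DBR 6.5, "5.5.x-scala2.11" on 5.5 LTS
```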
Thanks
Ishan
05-07-2020 08:25 AM
Further, I found that dbutils.library.installPyPI is supported on the 5.5 LTS runtime as well. In my case, I had some PyPI packages installed at the cluster level. I removed those cluster-level PyPI packages and used dbutils.library.installPyPI to install notebook-scoped packages instead. It works fine now.
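For reference, a minimal sketch of the notebook-scoped flow described above, to be run in a Databricks notebook where `dbutils` is predefined; the version pin is illustrative, not required:
```
# Notebook-scoped installs, run in their own cell after removing any
# conflicting cluster-level PyPI libraries from the cluster configuration.
dbutils.library.installPyPI("azure-identity")
dbutils.library.installPyPI("azure-storage-blob", version="12.0.0")
dbutils.library.restartPython()  # restart the Python process so the new packages are picked up

# In a later cell, dbutils.library.list() shows the libraries
# isolated to this notebook session.
```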