Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

%pip/%conda doesn't work with encrypted clusters starting DBR 9.x

Prabakar
Esteemed Contributor III

When you try to use the %pip or %conda magic commands on DBR 9.x or above, they fail with the following error:

%pip install numpy

org.apache.spark.SparkException: %pip/%conda commands use unencrypted NFS and are disabled by default when SSL encryption is enabled. NFS can be safely used to install libraries that do not contain PHI or other sensitive data, such as open source packages. %pip/%conda commands or NFS should not be used to transmit PHI to Spark workers. To enable %pip/%conda commands, set spark.databricks.conda.ignoreSSL to true in Spark config in cluster settings and restart your cluster.

How to resolve this?

1 ACCEPTED SOLUTION


Prabakar
Esteemed Contributor III

The error message says that to enable %pip/%conda commands, you should set spark.databricks.conda.ignoreSSL to true in the Spark config in the cluster settings and restart the cluster.

However, setting that value does not enable %pip commands.

To enable %pip, we need to set spark.databricks.pip.ignoreSSL to true in the Spark config instead.

This is a simple detail that is easy to overlook. I wanted to share it as a reminder not to rely blindly on the message in the error: investigate a little before applying the fix.
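For reference, a minimal sketch of the cluster's Spark config field, assuming you want both %pip and %conda working (the conda key is the one named in the error message; the pip key is the one that actually unblocks %pip):

```
spark.databricks.pip.ignoreSSL true
spark.databricks.conda.ignoreSSL true
```

After adding these entries and restarting the cluster, a command such as %pip install numpy should run without the SparkException. Remember that, as the error notes, this routes library installs over unencrypted NFS, so it should only be used for packages that contain no PHI or other sensitive data.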


2 REPLIES

Prabakar
Esteemed Contributor III

If you are not familiar with traffic encryption between cluster worker nodes, you can refer to the link below.

https://docs.microsoft.com/en-us/azure/databricks/security/encryption/encrypt-otw