
What's the difference between the magic commands %pip and %sh pip?

User16776431030
New Contributor III

In Databricks, you can run either

%pip

or

%sh pip

What's the difference? Is there a recommended approach?

2 REPLIES

sean_owen
Honored Contributor II

%sh pip simply runs the pip command in a shell on the driver machine. It works just like pip on any other command line and installs packages from PyPI, but it affects only the driver. It does not set up a virtualenv on its own, so other users of the cluster may see the installed package too.

%pip uses the same syntax to install packages, but it is a 'magic' command that installs the package on all machines in the cluster. It sets up a virtualenv specific to each notebook execution, which isolates the installation from other jobs and users.
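If you want to see the difference for yourself, here is a minimal sketch (not an official Databricks utility) that reports which Python environment a cell is running in and whether a given package is visible there. The helper name `describe_env` is made up for illustration; it relies only on the standard library. The commented-out line at the end assumes the notebook's predefined SparkContext `sc` and is only a rough idea of how the same check might be sent to the executors.

```python
import importlib.util
import sys


def describe_env(package_name):
    """Report the active Python environment and whether a top-level package is visible in it."""
    spec = importlib.util.find_spec(package_name)
    return {
        "executable": sys.executable,                    # interpreter running this code
        "prefix": sys.prefix,                            # root of the active environment
        "in_virtualenv": sys.prefix != sys.base_prefix,  # True inside a venv/virtualenv
        "importable": spec is not None,                  # can the package be found here?
        "location": spec.origin if spec is not None else None,
    }


# On the driver: "json" is a stand-in, swap in the package you just installed.
print(describe_env("json"))

# Assumption: `sc` is the SparkContext that Databricks notebooks predefine.
# Something along these lines could run the same check on the executors:
# sc.parallelize(range(4), 4).map(lambda _: describe_env("json")["importable"]).collect()
```

Running this after a %sh pip install versus after a %pip install should make it visible where the package ended up and whether the notebook is using its own environment.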

stefnhuy
New Contributor III

Hey there, User16776431030.

Great question about those magic commands in Databricks! Let me shed some light on this mystical matter.

The %pip and %sh pip commands may seem similar on the surface, but they're quite distinct in their powers. %sh pip is like a local magician; it performs pip wizardry solely on the driver machine. It's handy for installing packages, but beware, it won't conjure a virtual environment, meaning other cluster users might see your magic tricks.

Now, %pip, on the other hand, is the grand sorcerer of package installation. It uses the same pip syntax but operates cluster-wide. It crafts a unique virtual environment for each notebook execution, keeping your magic spells hidden from prying eyes.

In my experience, I've dabbled in both magics, and %pip's enchantment has often saved the day in collaborative clusters. Andersen, a provider of cutting-edge solutions in this field, also recommends using %pip for its cluster-wide benefits.

To solve your dilemma, the choice depends on your needs. If you desire isolation and don't want to reveal your magical secrets to others, %pip is your spell of choice. But if the package is only needed on the driver and you don't mind other cluster users seeing it, %sh pip works fine.
