Library Dependency

isaac_gritz
Databricks Employee

How to Install Libraries on Databricks

You can install libraries in Databricks at the cluster level for libraries commonly used on a cluster, at the notebook level using %pip, or through global init scripts when a library should be installed on all clusters.
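As a rough illustration of the cluster-level option, here is a minimal Python sketch that builds the request body for the Databricks Libraries API install endpoint (`POST /api/2.0/libraries/install`). The helper name `build_install_payload`, the cluster ID, and the package pins are assumptions made up for this example; the sketch only constructs the JSON and does not send anything to a workspace.

```python
import json


def build_install_payload(cluster_id, pypi_packages):
    """Build the JSON body for a cluster-level PyPI library install request."""
    return {
        "cluster_id": cluster_id,
        # Each PyPI library is described by a {"pypi": {"package": ...}} entry.
        "libraries": [{"pypi": {"package": pkg}} for pkg in pypi_packages],
    }


# Hypothetical cluster ID and package specs, for illustration only.
payload = build_install_payload(
    "1234-567890-abcde123",
    ["xgboost==1.7.6", "simplejson"],
)
print(json.dumps(payload, indent=2))
```

For the notebook-level option, the equivalent is a single cell containing `%pip install` followed by the package spec, which scopes the library to that notebook session rather than the whole cluster.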

The Databricks ML runtime comes with many commonly used ML libraries pre-installed, including scikit-learn, TensorFlow, and XGBoost (AWS | Azure | GCP).

You can learn more at this page (AWS | Azure | GCP). Let us know in the comments if you have any questions about library dependencies!

isaac_gritz
Databricks Employee

Thanks Abishek!

Prabakar
Databricks Employee

Good post @Isaac Gritz, and thanks @Abishek Subramanian for adding those links.

LearningDatabri
Contributor II

I saw a couple of your posts, and it seems that you are sharing information from the public docs. Instead of sharing the docs information, you could post some errors along with the troubleshooting and resolution steps. That would be much more helpful. Thanks.

Hi team, I am a Databricks employee and I am posting common questions I have received from Databricks customers in the past, along with the answers I would typically provide. I reached out to @Lindsay Olson to validate this approach. Happy to remove any posts or find a new forum/format if you would like.

Chris_Shehu
Valued Contributor III

It can be risky to install libraries without any sort of oversight or security structure to ensure those libraries have no vulnerabilities. I think more caution needs to be added to the wording of these documents to express that. All of the libraries we use go through a vetting process before they can actually be installed.
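One way such a vetting gate could look in practice is sketched below, assuming a team keeps an allowlist of approved packages and exact versions. The `APPROVED` table, the versions in it, and the `check_requirement` helper are all hypothetical and only illustrate the idea; a real process would also involve vulnerability scanning and review.

```python
import re

# Hypothetical allowlist: package name -> set of vetted versions.
APPROVED = {
    "scikit-learn": {"1.3.2"},
    "xgboost": {"1.7.6"},
}


def check_requirement(requirement, approved=APPROVED):
    """Return True only if the requirement is pinned with == to a vetted version."""
    match = re.fullmatch(
        r"\s*([A-Za-z0-9._-]+)\s*==\s*([A-Za-z0-9.]+)\s*", requirement
    )
    if match is None:
        # Unpinned or unparsable requirements are rejected outright.
        return False
    # Normalize the name so "Scikit_Learn" and "scikit-learn" compare equal.
    name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
    version = match.group(2)
    return version in approved.get(name, set())


for spec in ["xgboost==1.7.6", "xgboost", "requests==2.31.0"]:
    print(spec, "->", "allowed" if check_requirement(spec) else "blocked")
```

Rejecting unpinned requirements is a deliberate choice in this sketch: a bare package name would resolve to whatever version is latest at install time, which bypasses the version that was actually vetted.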