Data Engineering
Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.

Forum Posts

kidexp
by New Contributor II
  • 21865 Views
  • 7 replies
  • 2 kudos

Resolved! How to install python package on spark cluster

Hi, how can I install Python packages on a Spark cluster? Locally, I can use pip install. I want to use some external packages which are not installed on the Spark cluster. Thanks for any suggestions.

Latest Reply
Mikejerere
New Contributor II
  • 2 kudos

If --py-files doesn’t work, try this shorter method:

Create a Conda environment and install your packages:
conda create -n myenv python=3.x
conda activate myenv
pip install your-package

Package and submit: use conda-pack and spark-submit with --archives.
cond...
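The steps in the reply above might look like the following sketch (environment, package, and script names are placeholders; this mirrors the conda-pack pattern from Spark's Python package management docs and assumes conda, conda-pack, and spark-submit are available):

```shell
# Build an environment containing the packages the job needs
conda create -y -n myenv python=3.10
conda activate myenv
pip install your-package conda-pack

# Pack the environment into a relocatable archive
conda pack -n myenv -o myenv.tar.gz

# Ship the archive with the job; Spark unpacks it next to each
# executor under the alias after '#', here ./environment
export PYSPARK_PYTHON=./environment/bin/python
spark-submit --archives myenv.tar.gz#environment your_job.py
```

The `#environment` suffix controls the directory name the archive is unpacked into, which is why PYSPARK_PYTHON points at `./environment/bin/python`.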

  • 2 kudos
6 More Replies
Confused
by New Contributor III
  • 33211 Views
  • 6 replies
  • 3 kudos

Resolved! Configuring pip index-url and using artifacts-keyring

Hi, I would like to use the Azure Artifact feed as my default index-url when doing a pip install on a Databricks cluster. I understand I can achieve this by updating the pip.conf file with my artifact feed as the index-url. Does anyone know where i...

Latest Reply
murtazahzaveri
New Contributor II
  • 3 kudos

For authentication you can provide the below config in the cluster's Spark environment variables: PIP_EXTRA_INDEX_URL=https://username:password@pkgs.sample.com/sample/_packaging/artifactory_name/pypi/simple/. Also, you can store the value in a Databricks secret
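A sketch of what that cluster environment-variable entry could look like when the credentialed URL is kept in a Databricks secret (the scope and key names here are hypothetical; Databricks supports the `{{secrets/<scope>/<key>}}` syntax for referencing secrets in cluster environment variables):

```shell
# Cluster > Advanced options > Spark > Environment variables
# (scope "artifact-creds" and key "feed-url" are placeholder names)
PIP_EXTRA_INDEX_URL={{secrets/artifact-creds/feed-url}}
```

Storing the full authenticated URL as a secret keeps the username and password out of the cluster configuration UI and logs.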

  • 3 kudos
5 More Replies
William_Scardua
by Valued Contributor
  • 2873 Views
  • 3 replies
  • 1 kudos

Magic Pip Install Error

Hi guys, I receive this error when I try to use pip install, any ideas?

CalledProcessError Traceback (most recent call last) <command-3492276838775365> in <module> ----> 1 get_ipython().run_line_magic('pip', 'install /dbfs/File...

Latest Reply
Bartosz
New Contributor II
  • 1 kudos

Hi @William_Scardua! I changed the cluster runtime to 10.4 LTS and the error disappeared. Just letting you know, maybe it will help you too! Cheers!

  • 1 kudos
2 More Replies
tj-cycyota
by Databricks Employee
  • 8169 Views
  • 2 replies
  • 1 kudos

What's the difference between the magic commands %pip and %sh pip

In Databricks you can do either %pip or %sh pip. What's the difference? Is there a recommended approach?

Latest Reply
stefnhuy
New Contributor III
  • 1 kudos

Hey there, User16776431030. Great question about those magic commands in Databricks! Let me shed some light on this mystical matter. The %pip and %sh pip commands may seem similar on the surface, but they're quite distinct in their powers. %sh pip is l...
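One practical difference worth knowing: %pip installs into the Python environment the notebook actually uses (and propagates to the workers), while %sh pip runs whichever `pip` is first on the PATH, on the driver only, which may belong to a different Python installation. The interpreter-targeting half of that can be illustrated with plain Python, no Databricks required:

```python
import subprocess
import sys

# Invoking pip via the interpreter running this code is what %pip
# effectively guarantees; a bare `pip` on PATH (as with %sh pip)
# may be tied to a different Python installation entirely.
result = subprocess.run(
    [sys.executable, "-m", "pip", "--version"],
    capture_output=True, text=True, check=True,
)
print(result.stdout.strip())
```

This is why `python -m pip` is the commonly recommended invocation outside Databricks as well: it removes any ambiguity about which environment receives the package.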

  • 1 kudos
1 More Replies
Aritra
by New Contributor II
  • 1777 Views
  • 4 replies
  • 0 kudos

Git repo cloning on Databricks

I am running into issues importing the scalable-machine-learning-with-apache-spark library into Databricks; specifically, cloning it from the Git repo or %pip installing from the Git repo directly into Databricks. Any help is appreciated.

Latest Reply
Vidula
Honored Contributor
  • 0 kudos

Hi @Aritra Guha, hope all is well! Just wanted to check in on whether you were able to resolve your issue; if so, would you be happy to share the solution or mark an answer as best? Otherwise, please let us know if you need more help. We'd love to hear from you. Thanks...

  • 0 kudos
3 More Replies
kpendergast
by Contributor
  • 4646 Views
  • 6 replies
  • 4 kudos

Resolved! Hyperleaup to push data to Tableau Server

Would anyone care to share how they got the Hyperleaup library working? I am currently stuck on an error at publish and cannot seem to find a solution: TypeError: publish() got an unexpected keyword argument 'file_path'. I am %pip installing all the requir...

Latest Reply
badnishant79
New Contributor II
  • 4 kudos

Hi. Yes dashboard includes multiple filters but only uploaded dashboard on server without any other sheets. I am looking into the extract that other users have suggested. Thanks.

  • 4 kudos
5 More Replies
isaac_gritz
by Databricks Employee
  • 3355 Views
  • 6 replies
  • 8 kudos

Library Dependency

How to Install Libraries on Databricks: you can install libraries in Databricks at the cluster level for libraries commonly used on a cluster, at the notebook level using %pip, or using global init scripts when you have libraries that should be install...

Latest Reply
Chris_Shehu
Valued Contributor III
  • 8 kudos

It can be risky to install libraries without any sort of oversight/security structure to ensure those libraries have no vulnerabilities. I think more caution needs to be added to the wording of these documents to express that. All of the libraries w...

  • 8 kudos
5 More Replies
Prabakar
by Databricks Employee
  • 6607 Views
  • 2 replies
  • 5 kudos

Resolved! %pip/%conda doesn't work with encrypted clusters starting DBR 9.x

While trying to use the magic command %pip/%conda with DBR 9.x or above, it fails with the following error:

%pip install numpy
org.apache.spark.SparkException: %pip/%conda commands use unencrypted NFS and are disabled by default when SSL encryption is ...

Latest Reply
Prabakar
Databricks Employee
  • 5 kudos

If you are not aware of the traffic encryption between cluster worker nodes, you can refer to the link below: https://docs.microsoft.com/en-us/azure/databricks/security/encryption/encrypt-otw

  • 5 kudos
1 More Replies
MoJaMa
by Databricks Employee
  • 918 Views
  • 1 replies
  • 0 kudos
Latest Reply
MoJaMa
Databricks Employee
  • 0 kudos

Hosting your own internal PyPI mirror: that will allow you to manage and approve packages vs. directly downloading from public PyPI, and would also remove the dependency on an external service. Alternatively, upload all wheel files to DBFS, maybe through a CI/CD proce...
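The second suggestion, installing a wheel that was previously uploaded to DBFS, can be done notebook-scoped with a %pip cell like the following (the path is a hypothetical example, not from the thread):

```shell
# Notebook cell: install a wheel previously uploaded to DBFS,
# e.g. by a CI/CD pipeline (path is a placeholder)
%pip install /dbfs/FileStore/wheels/mypkg-1.0.0-py3-none-any.whl
```

Pointing %pip at a DBFS path means the install never reaches out to public PyPI, which fits the "no external dependency" goal described above.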

  • 0 kudos
EricThomas
by New Contributor
  • 11470 Views
  • 2 replies
  • 0 kudos

!pip install vs. dbutils.library.installPyPI()

Hello, Scenario: trying to install some Python modules into a notebook (scoped to just the notebook) using...

```
dbutils.library.installPyPI("azure-identity")
dbutils.library.installPyPI("azure-storage-blob")
dbutils.library.restartPython()
```

...ge...
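Worth noting for readers hitting this today: dbutils.library.installPyPI is deprecated in newer Databricks runtimes in favor of the %pip magic, so the notebook-scoped equivalent of the snippet above would be a cell like this (same packages as the question; %pip restarts the Python process for you when needed):

```shell
# Notebook cell: notebook-scoped install via the %pip magic
%pip install azure-identity azure-storage-blob
```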

Latest Reply
eishbis
New Contributor II
  • 0 kudos

Hi @ericOnline, I also faced the same issue, and I eventually found that upgrading the Databricks runtime version from my current "5.5 LTS (includes Apache Spark 2.4.3, Scala 2.11)" to "6.5 (Scala 2.11, Spark 2.4.5)" resolved this issue. Though the offic...

  • 0 kudos
1 More Replies