Join discussions on data engineering best practices, architectures, and optimization strategies within the Databricks Community. Exchange insights and solutions with fellow data engineers.
I have installed the library via PyPI on the cluster. When we import the package in a notebook, we get the following error:
import librosa
OSError: cannot load library 'libsndfile.so': libsndfile.so: cannot open shared object file: No such file or direct...
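One way to confirm the likely cause, assuming the problem is the missing OS-level libsndfile rather than the librosa package itself (pip does not install the native library): a minimal diagnostic sketch, with the system package name in the comment being an assumption.
# Diagnostic sketch (not a fix): check whether the native libsndfile shared
# library is visible on the driver node before importing librosa.
import ctypes.util
if ctypes.util.find_library("sndfile") is None:
    # The OS package is missing; installing it (for example libsndfile1 via an
    # apt-get cluster init script) is the usual remedy before librosa can import.
    print("libsndfile not found on this node")
else:
    import librosa  # should succeed once the shared library is present
    print("librosa", librosa.__version__)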
If anybody ends up here after 2024: the init file must now be placed in the workspace for the cluster to accept it. So in Workspace, use Create/File to create the init script. Then add it to the cluster config in Compute - Your cluster - Advanced Config...
Hi, I'm facing an issue while writing to a Salesforce sandbox from Databricks. I have installed the "spark-salesforce_2.12-1.1.4" library and my code is as follows:
df_newLeads.write\
    .format("com.springml.spark.salesforce")\
    .option("username...
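For reference, a hedged sketch of what a complete sandbox write with this connector typically looks like; the option names ("login", "sfObject") follow the springml/spark-salesforce README and should be checked against the installed version, and the credentials are placeholders rather than values from the post.
# Assumed write pattern for the springml spark-salesforce connector.
(df_newLeads.write
    .format("com.springml.spark.salesforce")
    .option("username", "user@example.com.mysandbox")
    .option("password", "passwordPlusSecurityToken")  # password concatenated with the security token
    .option("login", "https://test.salesforce.com")   # sandbox login URL instead of login.salesforce.com
    .option("sfObject", "Lead")                        # target Salesforce object
    .save())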
I made a function that used the code below and returned url, connectionProperties, sfwrite:
url = "https://login.salesforce.com/"
dom = url.split('//')[1].split('.')[0]
session_id, instance = SalesforceLogin(username=connectionProperties['name'], password...
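A minimal sketch of what such a helper might look like, assuming SalesforceLogin comes from the simple_salesforce package and that connectionProperties holds the credentials under the keys shown (both are assumptions, not details from the post).
from simple_salesforce import SalesforceLogin

def salesforce_session(connectionProperties, sandbox=False):
    # Sandbox orgs authenticate against test.salesforce.com, production against login.salesforce.com.
    url = "https://test.salesforce.com/" if sandbox else "https://login.salesforce.com/"
    dom = url.split('//')[1].split('.')[0]                 # "test" or "login"
    session_id, instance = SalesforceLogin(
        username=connectionProperties['name'],             # assumed key names
        password=connectionProperties['password'],
        security_token=connectionProperties.get('token', ''),
        domain=dom,
    )
    return url, session_id, instance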
I tried to install RMySQL on Databricks like this:
install.packages("RMySQL")
I got this error:
Installing package into ‘/local_disk0/.ephemeral_nfs/envs/rEnv-c677bc4c-e6a3-40df-a5ab-bfd5d277e0c0’ (as ‘lib’ is unspecified) Warning: unable to access inde...
Hi @miru miro, Thank you for posting your question in our community! We are happy to assist you. To help us provide you with the most accurate information, could you please take a moment to review the responses and select the one that best answers you...
If possible, how can one go about installing a Python library with SDK dependencies, like pyRFC? (https://github.com/SAP/PyRFC) The SDK dependencies depend on the type of OS, and since we're running Databricks out of AWS, I assume one would have to mat...
Hi @Wonseok Choi, Thank you for your question! To assist you better, please take a moment to review the answer and let me know if it best fits your needs. Please help us select the best solution by clicking on "Select As Best" if it does. Your feedback...
I would like to install a library that is under the /Workspace/Shared/ directory using the init.sh script in a cluster. How can I access the /Workspace/Shared/ folder in shell? This page only shows how to access it manually but doesn't show how to access i...
Hi @Juned Mala Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love to hear from you. Thanks!
In continuation of the issues encountered in this discussion: https://community.databricks.com/s/feed/0D58Y00009tCiQTSA0 I have a bizarre issue. Here are the two screenshots, taken a few seconds apart. Same cluster, same command, executed 6 seconds apar...
Hi @Ayush Modi, I'm sorry you could not find a solution to your problem in the answers provided. Our community strives to provide helpful and accurate information, but sometimes an immediate solution may only be available for some issues. I suggest pr...
I'm trying to install a Python library but I'm not able to; the status won't change from "pending". I get this message when I click on the library under the cluster's Libraries tab: "Library installation has been attempted on the driver node but has not...
OK, looks like I was able to solve my problem. First, I needed to install all the required libraries one by one. These are the following: pandas, six, requests, pyspnego, cryptography, krb5, requests-kerberos. After that I was able to install the webAPI library.
Hi, I'm facing an issue when writing to a Salesforce object. I'm using the springml/spark-salesforce library. I have the above libraries installed as recommended based on my research. I try to write like this:
(_sqldf
    .write
    .format("com.springml.spar...
Hello, I have an issue with the import of a custom library in Azure Databricks. Roughly 95% of the time it works fine, but sometimes it fails. I searched the internet and this community with no luck so far. It is a Scala library in a Scala notebook,...
I also encountered the same error. While importing a file, I get the error: "Import failed with error: Could not deserialize: Exceeded 16777216 bytes (current = 16778609)"
I read this article and created a notebook to use as a library, but when I tried to import it in another notebook I received this error: No module named 'lib.lib_test' No module named 'lib.lib_*****'
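Two hedged workarounds, depending on what 'lib' actually is; the workspace path below is hypothetical.
# 1) If lib_test is a notebook, the Python import system cannot load it; call it
#    from the other notebook with the %run magic instead, e.g. %run ./lib/lib_test
# 2) If lib_test.py is a plain .py workspace file, put its parent folder on sys.path:
import sys
sys.path.append("/Workspace/Shared")   # hypothetical folder that contains "lib"
from lib import lib_test              # works when lib/lib_test.py is a regular Python file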
Hey @Marco Antônio de Almeida Fernandes Hope all is well! Just wanted to check in if you were able to resolve your issue and would you be happy to share the solution or mark an answer as best? Else please let us know if you need more help. We'd love...
What versions of Spark, Python, Scala, and R are included in each Databricks Runtime? What libraries are pre-installed? You can find this info at the Databricks runtime releases page (AWS | Azure | GCP). Let us know if you have any additional questions on t...
I'm attempting to install SparkR on the cluster and have successfully installed other packages such as tidyverse via CRAN. The error is copied below; any help you can provide is greatly appreciated! Databricks Runtime 10.4 LTS. Library installation attempted on ...
Hi @Ross Hamilton, I believe SparkR comes built in with Databricks RStudio and you don't have to install it explicitly. You can import it directly with library(SparkR), and it works for you from your above comment. The error message you see could be re...
Hello, we are having issues installing the pdpbox library on a fresh cluster. This includes trying to upload and install a whl file, or using pip in a workbook. I have attached an example of an error received. Can anybody assist with installing the...
PDPbox is rarely updated, and it requires an older version of matplotlib (3.1.1): https://github.com/SauceCat/PDPbox It tries to install but fails because matplotlib requires pkgconfig. The solution to that is to use the Machine Learning runtime. There it will...
While trying to install the ffmpeg package using an init script on a Databricks cluster, it fails with the below error.
Init script:
#! /bin/bash
set -e
sudo apt-get update
sudo apt-get -y install ffmpeg
Error message:
E: Failed to fetch http://security.ubuntu...
Cause: The VMs are pointing to a cached old mirror which is not up to date, so downloading the package fails. Workaround: Use the below init script to install the "ffmpeg" package. To revert to the original lis...
We have created our own Artifactory and we use it to install Python dependencies and libraries. We would like to know how we can make use of our own Artifactory to install dependencies or libraries on Databricks clusters.
For private repos, you can find some good examples here:
https://kb.databricks.com/clusters/install-private-pypi-repo.html
https://towardsdatascience.com/install-custom-python-libraries-from-private-pypi-on-databricks-6a7669f6e6fd